* [PATCH/RFC] MTD: Striping layer core
@ 2006-03-30 7:57 Belyakov, Alexander
2006-03-30 9:06 ` Vitaly Wool
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Belyakov, Alexander @ 2006-03-30 7:57 UTC (permalink / raw)
To: linux-mtd; +Cc: Belyakov, Alexander, Korolev, Alexey, Kutergin, Timofey
Hello again!
As it was promised I have split patch with striping and related stuff
into three parts.
1. Striping core itself (this message)
2. Thread switching workaround ([PATCH/RFC] CFI: Threads switching issue
fix)
3. CFI Common polling thread ([PATCH/RFC] CFI: Common polling thread
feature)
This very message mainly contains striping layer core code. And, Yes, I
have extended explanations on it.
Lets start with striping layer brief description.
1. INTRODUCTION
The basic purpose of striping is to increase performance of memory
(flash chip in our case) operations using two or more independent
devices. Data are split into parts of a user-defined size (interleave
size), and these data chunks are sent to each chip. As system has
several independent chips it is possible to write chunks of data
simultaneously increasing overall operation performance.
One may say that striping is quite similar to already existing in MTD
concatenation layer. That is not true since these layers have some sharp
distinctions. The first one is the purpose. Concatenation only purpose
is to make larger device from several smaller devices. Striping purpose
is to make devices operate faster. Next difference is provided access to
sub-devices. Concatenation layer provides linear access to sub-devices.
Striping provides interleaved access to sub-devices.
For those who are familiar with RAID techniques I can draw an analogy.
Concatenation is like JBOD in the world of hard drives. JBOD means "just
a bunch of disks". Meanwhile striping is like RAID (Redundant Arrays of
Inexpensive Disks) level 0. Note, this analogy is for basic concept
clarification purpose only.
2. IMPLEMENTATION NOTES
First of all striped device is still a MTD device from the users point
of view. Users use usual mtd->* operations when work with them.
Devices representing parts of striped device are called sub-devices.
Whole striping layer can be divided into two functional parts. The first
one is sub-devices interleaving algorithm. The second one is providing
simultaneous operation pieces delivery to independent chips.
2.1. Interleaving algorithm
The data are split into chunks of so called interleave size. And these
chunks are pushed to chip queues. For example, if we have 2 physically
independent flash devices and choose interleaving size of 128 bytes.
Then write operation with 1024 bytes of data is issued by some
application. Interleaving algorithm splits data in 8 chunks and pushes
them into chip queues. Fist chip queue get chunk 1, chunk 3, chunk 5 and
chunk 7. Second chip queue get chunk 2, chunk 4, chunk 6 and chunk 8. In
the best case these chunks will written to flashes in pairs: chunk 1 and
chunk 2 at one time, chunk 3 and chunk 4 next, then chunk 5 and chunk 6,
chunk 7 and chunk 8 are the last.
Next to be mentioned here is erase block construction. The resulting
erase block size is the least common multiple (LCM) of sub-devices erase
sizes multiplied on number of sub-devices to be striped. When one erases
an erase block of the striped device, in general case several physical
erase blocks of sub-devices are erased. The main idea here is the
virtual "merging" of sub-devices erase blocks. In case of striping
different size devices additional increase of striped device erase size
might appear. For example, if we have 3 equal erase size (say, A bytes)
devices we get the erase size of striped device (A*3) bytes in case of
equal size devices. If we have the same 3 devices but total size of one
of them is smaller then we get the erase size of striped device (A*3*2).
Bad block handling is another question here. If one marks striped device
block as bad then all sub-devices blocks representing that striped
device block will be marked as bad despite not all of them are actually
bad.
2.2. Threads
Simultaneous operation means separate threads. Each independent chip
which participates in creation of striped volume has its own worker
thread. Worker threads are created at the stage of striped device
initialization. Each worker thread has its own operation queue and
interleaving algorithm feeds them. Worker threads interact with flash
drivers (CFI, NAND subsystem).
3. POSSIBLE CONFIGURATIONS AND LIMITATIONS
It is possible to stripe devices of the same type. We can't stripe NOR
and NAND, but only NOR and NOR or NAND and NAND. Flashes of the same
type can differ in erase size and total size.
3.1. Number of devices
Theoretically it is possible to stripe any number of sub-devices. But in
case of striping non-power-of-two number of sub-devices one should
always remember that someone even inside MTD code can try to make
alignment check or something else that assumes erase size being
power-of-two number. So be careful.
3.2. Different size sub-devices
Sub-devices can be of different sizes. There is a plain and simple usage
case for that. For example, you have two independent chips in the system
of the same type and total size. But one of those chips contains
bootloader, kernel and rootfs. So you have only part of one chip
available. And it is good idea to stripe one MTD device representing
whole chip and another one representing free partition on chip with
bootloader and kernel. In that case you will not waste the space.
But due to some properties of interleaving algorithm it is very likely
get increased erase size in case of striping thre ore more devices with
different size.
3.3. Different erase size sub-devices
As it was said it is possible to stripe devices with different erase
size (quite uncommon case). In that case LCM of erase sizes multiplied
on number of devices produces erase size of striped volume.
3.4. Different speed devices
Important thing that striping two devices with essentially different
write/erase speed will not give significant performance increase since
striped device will work with the speed of the slowest sub-device.
3.5. XIP
Using striping with XIP will not lead to any performance increase.
4. HOWTO CONFIGURE (WITH EXAMPLES)
There are two possible ways to configure striped device.
4.1. Built-in module
Allow generic configuration of the MTD striped volumes via the kernel
configuration string. The format is as follows:
mtdstripe=<stripedef>[;<stripedef>]
<stripedef> := <stripename>(<interleavesize>):<subdevname>,<subdevname>
Example:
CONFIG_CMDLINE="..........
mtdparts=flash1:512k(blob)ro,2m(kernel)ro,16m(root),16m(vol1),8m(vol2);f
lash2:16m(vol3),8m(vol4)
mtdstripe=stripe1(128):vol1,vol3;stripe2(128):vol2,vol4 ........."
In case of statically kernel link and kernel configuration string
parameters set striping is to be initialized by mphysmap module.
4.2. Standalone module
If you build mtdstripe.ko as a standalone module it is possible to pass
command line to the module via insmod. The format for the command line
is as follows:
insmod mtdstripe.ko byname="<stripedef>[;<stripedef>]"
<stripedef> := <stripename>(<interleavesize>):<subdevname>.<subdevname>
or it is possible to use MTD device number from "cat /proc/mtd" instead
of names:
insmod mtdstripe.ko bynumber="<stripedef>[;<stripedef>]"
<stripedef> :=
<stripename>(<interleavesize>):<subdevnumber>.<subdevnumber>
Examples:
insmod mtdstripe.ko
byname="stripe1(128):vol1.vol3;stripe2(128):vol2.vol4"
insmod mtdstripe.ko bynumber="stripe1(128):3.5;stripe2(128):4.6"
Note: you should use '.' as a delimiter for sub-device names and number
here due to insmod arguments limitation
4.3. How to choose interleave size?
Sub-devices should belong to different (independent) physical flash
chips in order to get performance increase. Interleave size describes
striping granularity and it is very important from performance point of
view. Write operation performance increase should be expected only if
the amount of data to be written larger than interleave size. For
example, if we have 512 bytes interleave size, we see no write speed
boost for files smaller than 512 bytes. File systems have a write buffer
of well known size (let it be 4096 bytes). Thus it is not good idea to
set interleave size larger than 2048 byte if we are striping two flash
chips and going to use the file system on it. For NOR devices the bottom
border for interleave size is defined by flash buffer size (64 bytes,
128 bytes, etc). But such a small values affects read speed on striped
volumes. Read performance decrease on striped volume is due to large
number of read sub-operations. Thus, if you are going to stripe N
devices and launch a file system having write buffer of size B, the
better choice for interleave size is IS = B / N or somewhat smaller, but
not smaller than single flash chip buffer size.
For NAND you should use page size as interleave size value.
5. KNOWN ISSUES AND POSSIBLE SOLUTIONS
In order to provide real simultaneous writes is very important to be
sure that worker thread switches to another while device is flushing
data from buffer to the chip. The original MTD code has an issue with
such a switching. If we have two thread of the same priority, one of
them will monopolize CPU until all the data chunks from its queue are
flushed to the chip. Apparently such a behavior will not gives any
performance increase. Here is an example:
Say we have 2 physically independent flash devices (striping layer has 2
worker threads) with interleaving size of 128 bytes. Write operation
with 1024 bytes of data is issued. Interleaving algorithm splits data in
8 chunks and pushes them into worker thread queues. Fist chip queue get
chunk 1, chunk 3, chunk 5 and chunk 7. Second chip queue get chunk 2,
chunk 4, chunk 6 and chunk 8. At this point both worker threads have the
same priority equal (in simple case) to priority of the caller thread.
Write to flashes begins here.
Worker thread 1 puts chunk 1 to flash 1 buffer and get free time
flushing data to flash. At that free time worker thread 2 should get
control and write chunk 2 to flash 2. But it won't despite worker thread
1 invokes rescheduling. Since switching between two threads with equal
priority has some uncertainty. And data chunks will be written in the
following order:
chunk 1
chunk 3
chunk 5
chunk 7
chunk 2
chunk 4
chunk 6
chunk 8
instead of expected:
chunk 1 chunk 2
chunk 3 chunk 4
chunk 5 chunk 6
chunk 7 chunk 8
It is obvious that one will not get any performance increase in the
first case.
That is not only striping problem, but also a problem of several
instances of file system mounted on different flashes. These file
systems also will not work simultaneously despite they are using
independent chips.
Two of possible solutions are presented in separated mailing threads
accompanying this one.
5.1. Threads priority switching
The suggested in separate message ([PATCH/RFC] CFI: Threads switching
issue fix) solution deals with temporarily threads priority lowering for
time data is being written from chip buffer to media. The main idea here
is to lower priority slightly of the one worker thread before
rescheduling. That stimulates thread switching providing actually
simultaneous writing. After device has completed write operation thread
restores its original priority.
Another modification here is concerned with the split udelay time in
small chunks. Long udelays negatively affects striping performance since
udelay call is represented by loop and can not be interrupted by other
thread. Small udelay chunks provide more accurate and timely thread
switching.
5.2. Common polling thread (CPT)
Common polling thread is presented as new MTD module that is being used
by CFI layer. It creates single polling thread removing rescheduling
problem. Polling for operation completion status is being done in one
thread raising semaphore in worker threads on completion.
The suggested CPT solution can be turned on in kernel configuration
file.
See mailing thread "[PATCH/RFC] CFI: Common polling thread feature".
6. VALIDATION AND PERFORMANCE GAIN
Suggested striping solution has been validated on arm-based platforms
with different types of flash memory (Including NOR, Sibley and NAND
chips). It is stable and shows performance gain. For NOR and Sibley we
saw up to 85% performance gain in case of using two separate flash
devices. Unfortunately we were not able to check striped NAND
performance gain due to our hardware limitations.
7. PATCH
The diff file below is a patch containing MTD Striping layer core (to be
applied on MTD snapshot 20060315).
Kind Regards,
Alexander Belyakov
diff -uNr a/drivers/mtd/Kconfig b/drivers/mtd/Kconfig
--- a/drivers/mtd/Kconfig 2006-03-05 22:07:54.000000000 +0300
+++ b/drivers/mtd/Kconfig 2006-03-28 12:06:45.000000000 +0400
@@ -36,6 +36,57 @@
file system spanning multiple physical flash chips. If unsure,
say 'Y'.
+config MTD_STRIPE
+ tristate "MTD striping support"
+ depends on MTD
+ help
+ Support for stripinging several MTD devices into a single
+ (virtual) one. This allows you to have -for example- a JFFS(2)
+ file system interleaving multiple physical flash chips. If
unsure,
+ say 'Y'.
+
+ If you build mtdstripe.ko as a module it is possible to pass
+ command line to the module via insmod
+
+ The format for the command line is as follows:
+
+ insmod mtdstripe.ko byname="<stripedef>[;<stripedef>]"
+ <stripedef> :=
<stripename>(<interleavesize>):<subdevname>.<subdevname>
+
+ or it is possible to use MTD device number:
+
+ insmod mtdstripe.ko bynumber="<stripedef>[;<stripedef>]"
+ <stripedef> :=
<stripename>(<interleavesize>):<subdevnumber>.<subdevnumber>
+
+ Subdevices should belong to different physical flash chips
+ in order to get performance increase
+
+ Examples:
+
+ insmod mtdstripe.ko
byname="stripe1(128):vol1.vol3;stripe2(128):vol2.vol4"
+ insmod mtdstripe.ko
bynumber="stripe1(128):3.5;stripe2(128):4.6"
+
+ Note: you should use '.' as a delimeter for subdevice names
here
+
+config MTD_CMDLINE_STRIPE
+ bool "Command line stripe configuration parsing"
+ depends on MTD_STRIPE = 'y'
+ ---help---
+ Allow generic configuration of the MTD striped volumes via the
kernel
+ command line.
+
+ The format for the command line is as follows:
+
+ mtdstripe=<stripedef>[;<stripedef>]
+ <stripedef> :=
<stripename>(<interleavesize>):<subdevname>,<subdevname>
+
+ Subdevices should belong to different physical flash chips
+ in order to get performance increase
+
+ Example:
+
+ mtdstripe=stripe1(128):vol1,vol3;stripe2(128):vol2,vol4
+
config MTD_PARTITIONS
bool "MTD partitioning support"
depends on MTD
diff -uNr a/drivers/mtd/Makefile b/drivers/mtd/Makefile
--- a/drivers/mtd/Makefile 2006-03-05 22:07:54.000000000 +0300
+++ b/drivers/mtd/Makefile 2006-03-28 12:10:48.000000000 +0400
@@ -9,6 +9,7 @@
obj-$(CONFIG_MTD) += $(mtd-y)
obj-$(CONFIG_MTD_CONCAT) += mtdconcat.o
+obj-$(CONFIG_MTD_STRIPE) += mtdstripe.o
obj-$(CONFIG_MTD_REDBOOT_PARTS) += redboot.o
obj-$(CONFIG_MTD_CMDLINE_PARTS) += cmdlinepart.o
obj-$(CONFIG_MTD_AFS_PARTS) += afs.o
diff -uNr a/drivers/mtd/maps/mphysmap.c b/drivers/mtd/maps/mphysmap.c
--- a/drivers/mtd/maps/mphysmap.c 2006-03-28 12:08:28.000000000
+0400
+++ b/drivers/mtd/maps/mphysmap.c 2006-03-28 12:10:48.000000000
+0400
@@ -12,6 +12,9 @@
#ifdef CONFIG_MTD_PARTITIONS
#include <linux/mtd/partitions.h>
#endif
+#ifdef CONFIG_MTD_CMDLINE_STRIPE
+#include <linux/mtd/stripe.h>
+#endif
static struct map_info mphysmap_static_maps[] = {
#if CONFIG_MTD_MULTI_PHYSMAP_1_WIDTH
@@ -155,6 +158,15 @@
};
};
up(&map_mutex);
+
+#ifdef CONFIG_MTD_CMDLINE_STRIPE
+#ifndef MODULE
+ if(mtd_stripe_init()) {
+ printk(KERN_WARNING "MTD stripe initialization from cmdline
has failed\n");
+ }
+#endif
+#endif
+
return 0;
}
@@ -162,6 +174,13 @@
static void __exit mphysmap_exit(void)
{
int i;
+
+#ifdef CONFIG_MTD_CMDLINE_STRIPE
+#ifndef MODULE
+ mtd_stripe_exit();
+#endif
+#endif
+
down(&map_mutex);
for (i=0;
i<sizeof(mphysmap_static_maps)/sizeof(mphysmap_static_maps[0]);
diff -uNr a/drivers/mtd/mtdstripe.c b/drivers/mtd/mtdstripe.c
--- a/drivers/mtd/mtdstripe.c 1970-01-01 03:00:00.000000000 +0300
+++ b/drivers/mtd/mtdstripe.c 2006-03-27 20:30:56.000000000 +0400
@@ -0,0 +1,3583 @@
+/*
########################################################################
#################################
+ ### This software program is available to you under a choice of one
of two licenses.
+ ### You may choose to be licensed under either the GNU General
Public License (GPL) Version 2,
+ ### June 1991, available at http://www.fsf.org/copyleft/gpl.html, or
the Intel BSD + Patent License,
+ ### the text of which follows:
+ ###
+ ### Recipient has requested a license and Intel Corporation
("Intel") is willing to grant a
+ ### license for the software entitled MTD stripe middle layer (the
"Software") being provided by
+ ### Intel Corporation.
+ ###
+ ### The following definitions apply to this License:
+ ###
+ ### "Licensed Patents" means patent claims licensable by Intel
Corporation which are necessarily
+ ### infringed by the use or sale of the Software alone or when
combined with the operating system
+ ### referred to below.
+ ### "Recipient" means the party to whom Intel delivers this
Software.
+ ### "Licensee" means Recipient and those third parties that receive
a license to any operating system
+ ### available under the GNU Public License version 2.0 or later.
+ ###
+ ### Copyright (c) 1995-2005 Intel Corporation. All rights reserved.
+ ###
+ ### The license is provided to Recipient and Recipient's Licensees
under the following terms.
+ ###
+ ### Redistribution and use in source and binary forms of the
Software, with or without modification,
+ ### are permitted provided that the following conditions are met:
+ ### Redistributions of source code of the Software may retain the
above copyright notice, this list
+ ### of conditions and the following disclaimer.
+ ###
+ ### Redistributions in binary form of the Software may reproduce the
above copyright notice,
+ ### this list of conditions and the following disclaimer in the
documentation and/or other materials
+ ### provided with the distribution.
+ ###
+ ### Neither the name of Intel Corporation nor the names of its
contributors shall be used to endorse
+ ### or promote products derived from this Software without specific
prior written permission.
+ ###
+ ### Intel hereby grants Recipient and Licensees a non-exclusive,
worldwide, royalty-free patent licens
+ ### e under Licensed Patents to make, use, sell, offer to sell,
import and otherwise transfer the
+ ### Software, if any, in source code and object code form. This
license shall include changes to
+ ### the Software that are error corrections or other minor changes to
the Software that do not add
+ ### functionality or features when the Software is incorporated in
any version of a operating system
+ ### that has been distributed under the GNU General Public License
2.0 or later. This patent license
+ ### shall apply to the combination of the Software and any operating
system licensed under the
+ ### GNU Public License version 2.0 or later if, at the time Intel
provides the Software to Recipient,
+ ### such addition of the Software to the then publicly available
versions of such operating system
+ ### available under the GNU Public License version 2.0 or later
(whether in gold, beta or alpha form)
+ ### causes such combination to be covered by the Licensed Patents.
The patent license shall not apply
+ ### to any other combinations which include the Software. No hardware
per se is licensed hereunder.
+ ###
+ ### THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
+ ### IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND
+ ### FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
SHALL INTEL OR ITS CONTRIBUTORS BE
+ ### LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES
+ ### (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA,
+ ### OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN
+ ### CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT
+ ### OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE."
+ ###
+
########################################################################
################################### */
+
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+
+#include <linux/mtd/mtd.h>
+#ifdef STANDALONE
+#include "stripe.h"
+#else
+#include <linux/mtd/stripe.h>
+#endif
+
+#ifdef CONFIG_MTD_CMDLINE_STRIPE
+#define CMDLINE_PARSER_STRIPE
+#else
+#ifdef MODULE
+#define CMDLINE_PARSER_STRIPE
+#endif
+#endif
+
+#ifdef MODULE
+static char *byname = NULL;
+static char *bynumber = NULL;
+MODULE_PARM(byname,"s");
+MODULE_PARM_DESC(byname,"Striping configuration by MTD device name");
+MODULE_PARM(bynumber,"s");
+MODULE_PARM_DESC(bynumber,"Striping configuration by MTD device
number");
+#endif
+
+extern struct semaphore mtd_table_mutex;
+extern struct mtd_info *mtd_table[];
+
+#ifdef CMDLINE_PARSER_STRIPE
+static char *cmdline;
+static struct mtd_stripe_info info; /* mtd stripe info head */
+#endif
+
+/*
+ * Striped device structure:
+ * Subdev points to an array of pointers to struct mtd_info objects
+ * which is allocated along with this structure
+ *
+ */
+struct mtd_stripe {
+ struct mtd_info mtd;
+ int num_subdev;
+ u_int32_t interleave_size;
+ u_int32_t *subdev_last_offset;
+ struct mtd_sw_thread_info *sw_threads;
+ struct mtd_info **subdev;
+};
+
+/* This structure is used for stripe_erase and stripe_lock/unlock
methods
+ * and contains erase regions for striped devices
+ */
+struct mtd_stripe_erase_bounds {
+ int need_erase;
+ u_int32_t addr;
+ u_int32_t len;
+};
+
+/* Write/erase thread info structure
+ */
+struct mtd_sw_thread_info {
+ struct task_struct *thread;
+ struct mtd_info *subdev; /* corresponding subdevice pointer */
+ int sw_thread; /* continue operations flag */
+
+ /* wait-for-data semaphore,
+ * up by stripe_write/erase (stripe_stop_write_thread),
+ * down by stripe_write_thread
+ */
+ struct semaphore sw_thread_wait;
+
+ /* start/stop semaphore,
+ * up by stripe_write_thread,
+ * down by stripe_start/stop_write_thread
+ */
+ struct semaphore sw_thread_startstop;
+
+ struct list_head list; /* head of the operation list */
+ spinlock_t list_lock; /* lock to remove race conditions
+ * while adding/removing operations
+ * to/from the list */
+};
+
+/* Single suboperation structure
+ */
+struct subop {
+ u_int32_t ofs; /* offset of write/erase operation */
+ u_int32_t len; /* length of the data to be
written/erased */
+ u_char *buf; /* buffer with data to be written or
poiner
+ * to original erase_info structure
+ * in case of erase operation */
+ u_char *eccbuf; /* buffer with FS provided oob data.
+ * used for stripe_write_ecc operation
+ * NOTE: stripe_write_oob() still uses
u_char *buf member */
+};
+
+/* Suboperation array structure
+ */
+struct subop_struct {
+ struct list_head list; /* suboperation array queue */
+
+ u_int32_t ops_num; /* number of suboperations in the array
*/
+ u_int32_t ops_num_max; /* maximum allowed number of
suboperations */
+ struct subop *ops_array; /* suboperations array */
+};
+
+/* Operation codes */
+#define MTD_STRIPE_OPCODE_READ 0x1
+#define MTD_STRIPE_OPCODE_WRITE 0x2
+#define MTD_STRIPE_OPCODE_READ_ECC 0x3
+#define MTD_STRIPE_OPCODE_WRITE_ECC 0x4
+#define MTD_STRIPE_OPCODE_WRITE_OOB 0x5
+#define MTD_STRIPE_OPCODE_ERASE 0x6
+
+/* Stripe operation structure
+ */
+struct mtd_stripe_op {
+ struct list_head list; /* per thread (device) queue */
+
+ char opcode; /* operation code */
+ int caller_id; /* reserved for thread ID issued this operation */
+ int op_prio; /* original operation prioriry */
+
+ struct semaphore sem; /* operation completed semaphore */
+ struct subop_struct subops; /* suboperation structure */
+
+ int status; /* operation completed status */
+ u_int32_t fail_addr; /* fail address (for erase operation) */
+ u_char state; /* state (for erase operation) */
+};
+
+#define SIZEOF_STRUCT_MTD_STRIPE_OP(num_ops) \
+ ((sizeof(struct mtd_stripe_op) + (num_ops) * sizeof(struct
subop)))
+
+#define SIZEOF_STRUCT_MTD_STRIPE_SUBOP(num_ops) \
+ ((sizeof(struct subop_struct) + (num_ops) * sizeof(struct
subop)))
+
+/*
+ * how to calculate the size required for the above structure,
+ * including the pointer array subdev points to:
+ */
+#define SIZEOF_STRUCT_MTD_STRIPE(num_subdev) \
+ ((sizeof(struct mtd_stripe) + (num_subdev) * sizeof(struct
mtd_info *) \
+ + (num_subdev) * sizeof(u_int32_t) \
+ + (num_subdev) * sizeof(struct mtd_sw_thread_info)))
+
+/*
+ * Given a pointer to the MTD object in the mtd_stripe structure,
+ * we can retrieve the pointer to that structure with this macro.
+ */
+#define STRIPE(x) ((struct mtd_stripe *)(x))
+
+/* Forward functions declaration
+ */
+static int stripe_dev_erase(struct mtd_info *mtd, struct erase_info
*erase);
+
+/*
+ * Miscelaneus support routines
+ */
+
+/*
+ * searches for least common multiple of a and b
+ * returns: LCM or 0 in case of error
+ */
+u_int32_t
+lcm(u_int32_t a, u_int32_t b)
+{
+ u_int32_t lcm;
+ u_int32_t t1 = a;
+ u_int32_t t2 = b;
+
+ if(a <= 0 || b <= 0)
+ {
+ lcm = 0;
+ printk(KERN_ERR "lcm(): wrong arguments\n");
+ }
+ else if(a == b)
+ {
+ /* trivial case */
+ lcm = a;
+ }
+ else
+ {
+ do
+ {
+ lcm = a;
+ a = b;
+ b = lcm - a*(lcm/a);
+ }
+ while(b!=0);
+
+ if(t1 % a)
+ lcm = (t2 / a) * t1;
+ else
+ lcm = (t1 / a) * t2;
+ }
+
+ return lcm;
+} /* int lcm(int a, int b) */
+
+u_int32_t last_offset(struct mtd_stripe *stripe, int subdev_num);
+
+/*
+ * Calculates last_offset for specific striped subdevice
+ * NOTE: subdev array MUST be sorted
+ * by subdevice size (from the smallest to the largest)
+ */
+u_int32_t
+last_offset(struct mtd_stripe *stripe, int subdev_num)
+{
+ u_int32_t offset = 0;
+
+ /* Interleave block count for previous subdevice in the array */
+ u_int32_t prev_dev_size_n = 0;
+
+ /* Current subdevice interleaved block count */
+ u_int32_t curr_size_n = stripe->subdev[subdev_num]->size /
stripe->interleave_size;
+
+ int i;
+
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ struct mtd_info *subdev = stripe->subdev[i];
+ /* subdevice interleaved block count */
+ u_int32_t size_n = subdev->size / stripe->interleave_size;
+
+ if(i < subdev_num)
+ {
+ if(size_n < curr_size_n)
+ {
+ offset += (size_n - prev_dev_size_n) *
(stripe->num_subdev - i);
+ prev_dev_size_n = size_n;
+ }
+ else
+ {
+ offset += (size_n - prev_dev_size_n - 1) *
(stripe->num_subdev - i) + 1;
+ prev_dev_size_n = size_n - 1;
+ }
+ }
+ else if (i == subdev_num)
+ {
+ offset += (size_n - prev_dev_size_n - 1) *
(stripe->num_subdev - i) + 1;
+ break;
+ }
+ }
+
+ return (offset * stripe->interleave_size);
+} /* u_int32_t last_offset(struct mtd_stripe *stripe, int subdev_num)
*/
+
+/* this routine returns oobavail size based on oobfree array
+ * since original mtd_info->oobavail field seems to be zeroed by
unknown reason
+ */
+int stripe_get_oobavail(struct mtd_info *mtd)
+{
+ int oobavail = 0;
+ uint32_t oobfree_max_num = 8; /* array size defined in mtd-abi.h */
+ int i;
+
+ for(i = 0; i < oobfree_max_num; i++)
+ {
+ if(mtd->oobinfo.oobfree[i][1])
+ oobavail += mtd->oobinfo.oobfree[i][1];
+ }
+
+ return oobavail;
+}
+
+/* routine merges subdevs oobinfo into new mtd device oobinfo
+ * this should be made after subdevices sorting done for proper eccpos
and oobfree positioning
+ *
+ * returns: 0 - success */
+int stripe_merge_oobinfo(struct mtd_info *mtd, struct mtd_info
*subdev[], int num_devs)
+{
+ int ret = 0;
+ int i, j;
+ uint32_t eccpos_max_num = sizeof(mtd->oobinfo.eccpos) /
sizeof(uint32_t);
+ uint32_t eccpos_counter = 0;
+ uint32_t oobfree_max_num = 8; /* array size defined in mtd-abi.h */
+ uint32_t oobfree_counter = 0;
+
+ if(mtd->type != MTD_NANDFLASH)
+ return 0;
+
+ mtd->oobinfo.useecc = subdev[0]->oobinfo.useecc;
+ mtd->oobinfo.eccbytes = subdev[0]->oobinfo.eccbytes;
+ for(i = 1; i < num_devs; i++)
+ {
+ if(mtd->oobinfo.useecc != subdev[i]->oobinfo.useecc ||
+ mtd->oobinfo.eccbytes != subdev[i]->oobinfo.eccbytes)
+ {
+ printk(KERN_ERR "stripe_merge_oobinfo(): oobinfo parameters
is not compatible for all subdevices\n");
+ return -EINVAL;
+ }
+ }
+
+ mtd->oobinfo.eccbytes *= num_devs;
+
+ /* drop old oobavail value */
+ mtd->oobavail = 0;
+
+ /* merge oobfree space positions */
+ for(i = 0; i < num_devs; i++)
+ {
+ for(j = 0; j < oobfree_max_num; j++)
+ {
+ if(subdev[i]->oobinfo.oobfree[j][1])
+ {
+ if(oobfree_counter >= oobfree_max_num)
+ break;
+
+ mtd->oobinfo.oobfree[oobfree_counter][0] =
subdev[i]->oobinfo.oobfree[j][0] +
+ i *
subdev[i]->oobsize;
+ mtd->oobinfo.oobfree[oobfree_counter][1] =
subdev[i]->oobinfo.oobfree[j][1];
+
+ mtd->oobavail += subdev[i]->oobinfo.oobfree[j][1];
+ oobfree_counter++;
+ }
+ }
+ }
+
+ /* merge ecc positions */
+ for(i = 0; i < num_devs; i++)
+ {
+ for(j = 0; j < eccpos_max_num; j++)
+ {
+ if(subdev[i]->oobinfo.eccpos[j])
+ {
+ if(eccpos_counter >= eccpos_max_num)
+ {
+ printk(KERN_ERR "stripe_merge_oobinfo(): eccpos
merge error\n");
+ return -EINVAL;
+ }
+
mtd->oobinfo.eccpos[eccpos_counter]=subdev[i]->oobinfo.eccpos[j] + i *
subdev[i]->oobsize;
+ eccpos_counter++;
+ }
+ }
+ }
+
+ return ret;
+}
+
+/* End of support routines */
+
+/* Multithreading support routines */
+
+/* Write to flash thread */
+static void
+stripe_write_thread(void *arg)
+{
+ struct mtd_sw_thread_info* info = (struct mtd_sw_thread_info*)arg;
+ struct mtd_stripe_op* op;
+ struct subop_struct* subops;
+ u_int32_t retsize;
+ int err;
+
+ int i;
+ struct list_head *pos;
+
+ /* erase operation stuff */
+ struct erase_info erase; /* local copy */
+ struct erase_info *instr; /* pointer to original */
+
+ info->thread = current;
+ up(&info->sw_thread_startstop);
+
+ while(info->sw_thread)
+ {
+ /* wait for downcoming write/erase operation */
+ down(&info->sw_thread_wait);
+
+ /* issue operation to the device and remove it from the list
afterwards*/
+ spin_lock(&info->list_lock);
+ if(!list_empty(&info->list))
+ {
+ op = list_entry(info->list.next,struct mtd_stripe_op, list);
+ }
+ else
+ {
+ /* no operation in queue but sw_thread_wait has been rised.
+ * it means stripe_stop_write_thread() has been called
+ */
+ op = NULL;
+ }
+ spin_unlock(&info->list_lock);
+
+ /* leave main thread loop if no ops */
+ if(!op)
+ break;
+
+ err = 0;
+ op->status = 0;
+
+ switch(op->opcode)
+ {
+ case MTD_STRIPE_OPCODE_WRITE:
+ case MTD_STRIPE_OPCODE_WRITE_OOB:
+ /* proceed with list head first */
+ subops = &op->subops;
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ if(op->opcode == MTD_STRIPE_OPCODE_WRITE)
+ err = info->subdev->write(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
subops->ops_array[i].buf);
+ else
+ err = info->subdev->write_oob(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
subops->ops_array[i].buf);
+
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: write operation
failed %d\n",err);
+ break;
+ }
+ }
+
+ if(!op->status)
+ {
+ /* now proceed each list element except head */
+ list_for_each(pos, &op->subops.list)
+ {
+ subops = list_entry(pos, struct subop_struct,
list);
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ if(op->opcode == MTD_STRIPE_OPCODE_WRITE)
+ err = info->subdev->write(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
subops->ops_array[i].buf);
+ else
+ err =
info->subdev->write_oob(info->subdev, subops->ops_array[i].ofs,
subops->ops_array[i].len, &retsize, subops->ops_array[i].buf);
+
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: write
operation failed %d\n",err);
+ break;
+ }
+ }
+
+ if(op->status)
+ break;
+ }
+ }
+ break;
+
+ case MTD_STRIPE_OPCODE_ERASE:
+ subops = &op->subops;
+ instr = (struct erase_info *)subops->ops_array[0].buf;
+
+ /* make a local copy of original erase instruction to
avoid modifying the caller's struct */
+ erase = *instr;
+ erase.addr = subops->ops_array[0].ofs;
+ erase.len = subops->ops_array[0].len;
+
+ if ((err = stripe_dev_erase(info->subdev, &erase)))
+ {
+ /* sanity check: should never happen since
+ * block alignment has been checked early in
stripe_erase() */
+
+ if(erase.fail_addr != 0xffffffff)
+ /* For now this adddres shows address
+ * at failed subdevice,but not at "super" device
*/
+ op->fail_addr = erase.fail_addr;
+ }
+
+ op->status = err;
+ op->state = erase.state;
+ break;
+
+ case MTD_STRIPE_OPCODE_WRITE_ECC:
+ /* proceed with list head first */
+ subops = &op->subops;
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ err = info->subdev->write_ecc(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf,
+
subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: write operation
failed %d\n",err);
+ break;
+ }
+ }
+
+ if(!op->status)
+ {
+ /* now proceed each list element except head */
+ list_for_each(pos, &op->subops.list)
+ {
+ subops = list_entry(pos, struct subop_struct,
list);
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ err = info->subdev->write_ecc(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf,
+
subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: write
operation failed %d\n",err);
+ break;
+ }
+ }
+
+ if(op->status)
+ break;
+ }
+ }
+ break;
+
+ case MTD_STRIPE_OPCODE_READ_ECC:
+ case MTD_STRIPE_OPCODE_READ:
+ /* proceed with list head first */
+ subops = &op->subops;
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ if(op->opcode == MTD_STRIPE_OPCODE_READ_ECC)
+ {
+ err = info->subdev->read_ecc(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf,
+
subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
+ }
+ else
+ {
+ err = info->subdev->read(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf);
+ }
+
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: read operation
failed %d\n",err);
+ break;
+ }
+ }
+
+ if(!op->status)
+ {
+ /* now proceed each list element except head */
+ list_for_each(pos, &op->subops.list)
+ {
+ subops = list_entry(pos, struct subop_struct,
list);
+
+ for(i = 0; i < subops->ops_num; i++)
+ {
+ if(op->opcode == MTD_STRIPE_OPCODE_READ_ECC)
+ {
+ err =
info->subdev->read_ecc(info->subdev, subops->ops_array[i].ofs,
subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf,
+
subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
+ }
+ else
+ {
+ err = info->subdev->read(info->subdev,
subops->ops_array[i].ofs, subops->ops_array[i].len,
+ &retsize,
subops->ops_array[i].buf);
+ }
+
+ if(err)
+ {
+ op->status = -EINVAL;
+ printk(KERN_ERR "mtd_stripe: read
operation failed %d\n",err);
+ break;
+ }
+ }
+
+ if(op->status)
+ break;
+ }
+ }
+
+ break;
+
+ default:
+ /* unknown operation code */
+ printk(KERN_ERR "mtd_stripe: invalid operation code %d",
op->opcode);
+ op->status = -EINVAL;
+ break;
+ };
+
+ /* remove issued operation from the list */
+ spin_lock(&info->list_lock);
+ list_del(&op->list);
+ spin_unlock(&info->list_lock);
+
+ /* raise semaphore to let stripe_write() or stripe_erase()
continue */
+ up(&op->sem);
+ }
+
+ info->thread = NULL;
+ up(&info->sw_thread_startstop);
+}
+
+/* Launches write to flash thread */
+int
+stripe_start_write_thread(struct mtd_sw_thread_info* info, struct
mtd_info *device)
+{
+ pid_t pid;
+ int ret = 0;
+
+ if(info->thread)
+ BUG();
+
+ info->subdev = device; /* set the
pointer to corresponding device */
+
+ init_MUTEX_LOCKED(&info->sw_thread_startstop); /* init
start/stop semaphore */
+ info->sw_thread = 1; /* set continue
thread flag */
+ init_MUTEX_LOCKED(&info->sw_thread_wait); /* init "wait for data"
semaphore */
+
+ INIT_LIST_HEAD(&info->list); /* initialize
operation list head */
+
+ spin_lock_init(&info->list_lock); /* init list lock */
+
+ pid = kernel_thread((int (*)(void *))stripe_write_thread, info,
CLONE_KERNEL); /* flags (3rd arg) TBD */
+ if (pid < 0)
+ {
+ printk(KERN_ERR "fork failed for MTD stripe thread: %d\n",
-pid);
+ ret = pid;
+ }
+ else
+ {
+ /* wait thread started */
+ DEBUG(MTD_DEBUG_LEVEL1, "MTD stripe: write thread has pid %d\n",
pid);
+ down(&info->sw_thread_startstop);
+ }
+
+ return ret;
+}
+
+/* Complete write to flash thread */
+void
+stripe_stop_write_thread(struct mtd_sw_thread_info* info)
+{
+ if(info->thread)
+ {
+ info->sw_thread = 0; /* drop thread flag */
+ up(&info->sw_thread_wait); /* let the thread
complete */
+ down(&info->sw_thread_startstop); /* wait for thread
completion */
+ DEBUG(MTD_DEBUG_LEVEL1, "MTD stripe: writing thread has been
stopped\n");
+ }
+}
+
+/* Updates write/erase thread priority to max value
+ * based on operations in the queue
+ */
+void
+stripe_set_write_thread_prio(struct mtd_sw_thread_info* info)
+{
+ struct mtd_stripe_op *op;
+ int oldnice, newnice;
+ struct list_head *pos;
+
+ newnice = oldnice = info->thread->static_prio - MAX_RT_PRIO - 20;
+
+ spin_lock(&info->list_lock);
+ list_for_each(pos, &info->list)
+ {
+ op = list_entry(pos, struct mtd_stripe_op, list);
+ newnice = (op->op_prio < newnice) ? op->op_prio : newnice;
+ }
+ spin_unlock(&info->list_lock);
+
+ newnice = (newnice < -20) ? -20 : newnice;
+
+ if(oldnice != newnice)
+ set_user_nice(info->thread, newnice);
+}
+
+/* add sub operation into the array
+ op - pointer to the operation structure
+ ofs - operation offset within subdevice
+ len - data to be written/erased
+ buf - pointer to the buffer with data to be written (NULL is erase
operation)
+
+ returns: 0 - success
+*/
+static inline int
+stripe_add_subop(struct mtd_stripe_op *op, u_int32_t ofs, u_int32_t
len, const u_char *buf, const u_char *eccbuf)
+{
+ u_int32_t size; /* number of items in
the new array (if any) */
+ struct subop_struct *subop;
+
+ if(!op)
+ BUG(); /* error */
+
+ /* get tail list element or head */
+ subop = list_entry(op->subops.list.prev, struct subop_struct,
list);
+
+ /* check if current suboperation array is already filled or not */
+ if(subop->ops_num >= subop->ops_num_max)
+ {
+ /* array is full. allocate new one and add to list */
+ size = SIZEOF_STRUCT_MTD_STRIPE_SUBOP(op->subops.ops_num_max);
+ subop = kmalloc(size, GFP_KERNEL);
+ if(!subop)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(subop, 0, size);
+ subop->ops_num = 0;
+ subop->ops_num_max = op->subops.ops_num_max;
+ subop->ops_array = (struct subop *)(subop + 1);
+
+ list_add_tail(&subop->list, &op->subops.list);
+ }
+
+ subop->ops_array[subop->ops_num].ofs = ofs;
+ subop->ops_array[subop->ops_num].len = len;
+ subop->ops_array[subop->ops_num].buf = (u_char *)buf;
+ subop->ops_array[subop->ops_num].eccbuf = (u_char *)eccbuf;
+
+ subop->ops_num++; /* increase stored suboperations counter */
+
+ return 0;
+}
+
+/* deallocates memory allocated by stripe_add_subop routine */
+static void
+stripe_destroy_op(struct mtd_stripe_op *op)
+{
+ struct subop_struct *subop;
+
+ while(!list_empty(&op->subops.list))
+ {
+ subop = list_entry(op->subops.list.next,struct subop_struct,
list);
+ list_del(&subop->list);
+ kfree(subop);
+ }
+}
+
+/* adds new operation to the thread queue and unlock wait semaphore for
specific thread */
+static void
+stripe_add_op(struct mtd_sw_thread_info* info, struct mtd_stripe_op*
op)
+{
+ if(!info || !op)
+ BUG();
+
+ spin_lock(&info->list_lock);
+ list_add_tail(&op->list, &info->list);
+ spin_unlock(&info->list_lock);
+}
+
+/* End of multithreading support routines */
+
+
+/*
+ * MTD methods which look up the relevant subdevice, translate the
+ * effective address and pass through to the subdevice.
+ */
+
+
+/* sychroneous read from striped volume */
+static int
+stripe_read_sync(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf)
+{
+ u_int32_t from_loc = (u_int32_t)from; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+ size_t retsize; /* data read/written from/to
subdev (bytes) */
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_sync(): offset = 0x%08x, size
= %d\n", from_loc, len);
+
+ /* Check whole striped device bounds here */
+ if(from_loc + len > mtd->size)
+ {
+ return err;
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_sync(): device = %d, offset =
0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
+ err =
stripe->subdev[subdev_number]->read(stripe->subdev[subdev_number],
subdev_offset_low, subdev_len, &retsize, buf);
+ if(!err)
+ {
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_sync(): device = %d, offset
= 0x%08x, len = %d\n", subdev_number, subdev_offset *
stripe->interleave_size, subdev_len);
+ err =
stripe->subdev[subdev_number]->read(stripe->subdev[subdev_number],
subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf);
+ if(err)
+ break;
+
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_sync(): read %d bytes\n",
*retlen);
+ return err;
+}
+
+
+/* asychroneous read from striped volume */
+static int
+stripe_read_async(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf)
+{
+ u_int32_t from_loc = (u_int32_t)from; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_async(): offset = 0x%08x, size
= %d\n", from_loc, len);
+
+ /* Check whole striped device bounds here */
+ if(from_loc + len > mtd->size)
+ {
+ return err;
+ }
+
+ /* allocate memory for multithread operations */
+ queue_size = len / stripe->interleave_size / stripe->num_subdev +
1; /* default queue size. could be set to predefined value */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_READ;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* asynch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_async(): device = %d, offset =
0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
subdev_len, buf, NULL);
+ if(!err)
+ {
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_async(): device = %d,
offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
stripe->interleave_size, subdev_len);
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset *
stripe->interleave_size, subdev_len, buf, NULL);
+ if(err)
+ break;
+
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ /* Push operation into the corresponding threads queue and rise
semaphores */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ err = ops[i].status;
+ }
+
+ /* Deallocate all memory before exit */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_async(): read %d bytes\n",
*retlen);
+ return err;
+}
+
+
+static int
+stripe_read(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf)
+{
+ int err;
+ if(mtd->type == MTD_NANDFLASH)
+ err = stripe_read_async(mtd, from, len, retlen, buf);
+ else
+ err = stripe_read_sync(mtd, from, len, retlen, buf);
+
+ return err;
+}
+
+
+static int
+stripe_write(struct mtd_info *mtd, loff_t to, size_t len,
+ size_t * retlen, const u_char * buf)
+{
+ u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned block */
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write(): offset = 0x%08x, size =
%d\n", to_loc, len);
+
+ /* check if no data is going to be written */
+ if(!len)
+ return 0;
+
+ /* Check whole striped device bounds here */
+ if(to_loc + len > mtd->size)
+ return err;
+
+ /* allocate memory for multithread operations */
+ queue_size = len / stripe->interleave_size / stripe->num_subdev +
1; /* default queue size. could be set to predefined value */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_WRITE;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(to_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
- 1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
+ subdev_number = (to_loc / stripe->interleave_size) % dev_count;
+ }
+
+ subdev_offset_low = to_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
subdev_len, buf, NULL);
+ if(!err)
+ {
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(to_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset *
stripe->interleave_size, subdev_len, buf, NULL);
+ if(err)
+ break;
+
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ if(to_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ /* Push operation into the corresponding threads queue and rise
semaphores */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ err = ops[i].status;
+ }
+
+ /* Deallocate all memory before exit */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write(): written %d bytes\n",
*retlen);
+ return err;
+}
+
+
+/* synchroneous ecc read from striped volume */
+static int
+stripe_read_ecc_sync(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf, u_char * eccbuf,
+ struct nand_oobinfo *oobsel)
+{
+ u_int32_t from_loc = (u_int32_t)from; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+ size_t retsize; /* data read/written from/to
subdev (bytes) */
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_sync(): offset = 0x%08x,
size = %d\n", from_loc, len);
+
+ if(oobsel != NULL)
+ {
+ /* check if oobinfo is has been chandes by FS */
+ if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
+ {
+ printk(KERN_ERR "stripe_read_ecc_sync(): oobinfo has been
changed by FS (not supported yet)\n");
+ return err;
+ }
+ }
+
+ /* Check whole striped device bounds here */
+ if(from_loc + len > mtd->size)
+ {
+ return err;
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_sync(): device = %d,
offset = 0x%08x, len = %d\n", subdev_number, subdev_offset_low,
subdev_len);
+ err =
stripe->subdev[subdev_number]->read_ecc(stripe->subdev[subdev_number],
subdev_offset_low, subdev_len, &retsize, buf, eccbuf,
&stripe->subdev[subdev_number]->oobinfo);
+ if(!err)
+ {
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_sync(): device = %d,
offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
stripe->interleave_size, subdev_len);
+ err =
stripe->subdev[subdev_number]->read_ecc(stripe->subdev[subdev_number],
subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf,
eccbuf, &stripe->subdev[subdev_number]->oobinfo);
+ if(err)
+ break;
+
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(from + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_sync(): read %d bytes\n",
*retlen);
+ return err;
+}
+
+
+/* asynchroneous ecc read from striped volume */
+static int
+stripe_read_ecc_async(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf, u_char * eccbuf,
+ struct nand_oobinfo *oobsel)
+{
+ u_int32_t from_loc = (u_int32_t)from; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_async(): offset = 0x%08x,
size = %d\n", from_loc, len);
+
+ if(oobsel != NULL)
+ {
+ /* check if oobinfo is has been chandes by FS */
+ if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
+ {
+ printk(KERN_ERR "stripe_read_ecc_async(): oobinfo has been
changed by FS (not supported yet)\n");
+ return err;
+ }
+ }
+
+ /* Check whole striped device bounds here */
+ if(from_loc + len > mtd->size)
+ {
+ return err;
+ }
+
+ /* allocate memory for multithread operations */
+ queue_size = len / stripe->interleave_size / stripe->num_subdev +
1; /* default queue size. could be set to predefined value */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_READ_ECC;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Issue read operation here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_async(): device = %d,
offset = 0x%08x, len = %d\n", subdev_number, subdev_offset_low,
subdev_len);
+
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
subdev_len, buf, eccbuf);
+ if(!err)
+ {
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(eccbuf)
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(from_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Issue read operation here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_async(): device = %d,
offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
stripe->interleave_size, subdev_len);
+
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset *
stripe->interleave_size, subdev_len, buf, eccbuf);
+ if(err)
+ break;
+
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(eccbuf)
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(from + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ /* Push operation into the corresponding threads queue and rise
semaphores */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ err = ops[i].status;
+ }
+
+ /* Deallocate all memory before exit */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_async(): read %d bytes\n",
*retlen);
+ return err;
+}
+
+
+static int
+stripe_read_ecc(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf, u_char * eccbuf,
+ struct nand_oobinfo *oobsel)
+{
+ int err;
+ if(mtd->type == MTD_NANDFLASH)
+ err = stripe_read_ecc_async(mtd, from, len, retlen, buf, eccbuf,
oobsel);
+ else
+ err = stripe_read_ecc_sync(mtd, from, len, retlen, buf, eccbuf,
oobsel);
+
+ return err;
+}
+
+
+static int
+stripe_write_ecc(struct mtd_info *mtd, loff_t to, size_t len,
+ size_t * retlen, const u_char * buf, u_char * eccbuf,
+ struct nand_oobinfo *oobsel)
+{
+ u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned block */
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_ecc(): offset = 0x%08x, size
= %d\n", to_loc, len);
+
+ if(oobsel != NULL)
+ {
+ /* check if oobinfo is has been chandes by FS */
+ if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
+ {
+ printk(KERN_ERR "stripe_write_ecc(): oobinfo has been
changed by FS (not supported yet)\n");
+ return err;
+ }
+ }
+
+ /* check if no data is going to be written */
+ if(!len)
+ return 0;
+
+ /* Check whole striped device bounds here */
+ if(to_loc + len > mtd->size)
+ return err;
+
+ /* allocate memory for multithread operations */
+ queue_size = len / stripe->interleave_size / stripe->num_subdev +
1; /* default queue size */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_WRITE_ECC;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(to_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
- 1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
+ subdev_number = (to_loc / stripe->interleave_size) % dev_count;
+ }
+
+ subdev_offset_low = to_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
subdev_len, buf, eccbuf);
+ if(!err)
+ {
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(eccbuf)
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(to_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset *
stripe->interleave_size, subdev_len, buf, eccbuf);
+ if(err)
+ break;
+
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+ if(eccbuf)
+ eccbuf += stripe->subdev[subdev_number]->oobavail;
+
+ if(to_loc + *retlen >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ /* Push operation into the corresponding threads queue and rise
semaphores */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ err = ops[i].status;
+ }
+
+ /* Deallocate all memory before exit */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_ecc(): written %d bytes\n",
*retlen);
+ return err;
+}
+
+
+static int
+stripe_read_oob(struct mtd_info *mtd, loff_t from, size_t len,
+ size_t * retlen, u_char * buf)
+{
+ u_int32_t from_loc = (u_int32_t)from; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+ size_t retsize; /* data read/written from/to
subdev (bytes) */
+
+ u_int32_t subdev_oobavail = stripe->subdev[0]->oobsize;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_oob(): offset = 0x%08x, size =
%d\n", from_loc, len);
+
+ /* Check whole striped device bounds here */
+ if(from_loc + len > mtd->size)
+ {
+ return err;
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % subdev_oobavail;
+ subdev_len = (len_left < (subdev_oobavail - subdev_offset_low)) ?
len_left : (subdev_oobavail - subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_oob(): device = %d, offset =
0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
+ err =
stripe->subdev[subdev_number]->read_oob(stripe->subdev[subdev_number],
subdev_offset_low, subdev_len, &retsize, buf);
+ if(!err)
+ {
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ /* increase flash offset by interleave size since oob blocks
+ * aligned with page size (i.e. interleave size) */
+ from_loc += stripe->interleave_size;
+
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < subdev_oobavail) ? len_left :
subdev_oobavail;
+
+ /* Synch read here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_oob(): device = %d, offset
= 0x%08x, len = %d\n", subdev_number, subdev_offset *
stripe->interleave_size, subdev_len);
+ err =
stripe->subdev[subdev_number]->read_oob(stripe->subdev[subdev_number],
subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf);
+ if(err)
+ break;
+
+ *retlen += retsize;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ /* increase flash offset by interleave size since oob blocks
+ * aligned with page size (i.e. interleave size) */
+ from_loc += stripe->interleave_size;
+
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_oob(): read %d bytes\n",
*retlen);
+ return err;
+}
+
+static int
+stripe_write_oob(struct mtd_info *mtd, loff_t to, size_t len,
+ size_t *retlen, const u_char * buf)
+{
+ u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned block */
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to read/write
left (bytes) */
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ //u_int32_t subdev_oobavail = stripe->subdev[0]->oobavail;
+ u_int32_t subdev_oobavail = stripe->subdev[0]->oobsize;
+
+ *retlen = 0;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_oob(): offset = 0x%08x, size
= %d\n", to_loc, len);
+
+ /* check if no data is going to be written */
+ if(!len)
+ return 0;
+
+ /* Check whole striped device bounds here */
+ if(to_loc + len > mtd->size)
+ return err;
+
+ /* allocate memory for multithread operations */
+ queue_size = len / subdev_oobavail / stripe->num_subdev + 1;
/* default queue size. could be set to predefined value */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "stripe_write_oob(): memory allocation
error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_WRITE_OOB;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(to_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
- 1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
+ subdev_number = (to_loc / stripe->interleave_size) % dev_count;
+ }
+
+ subdev_offset_low = to_loc % subdev_oobavail;
+ subdev_len = (len_left < (subdev_oobavail - subdev_offset_low)) ?
len_left : (subdev_oobavail - subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
subdev_len, buf, NULL);
+
+ if(!err)
+ {
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ /* increase flash offset by interleave size since oob blocks
+ * aligned with page size (i.e. interleave size) */
+ to_loc += stripe->interleave_size;
+
+ if(to_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < subdev_oobavail) ? len_left :
subdev_oobavail;
+
+ /* Add suboperation to queue here */
+ err = stripe_add_subop(&ops[subdev_number], subdev_offset *
stripe->interleave_size, subdev_len, buf, NULL);
+ if(err)
+ break;
+
+ *retlen += subdev_len;
+ len_left -= subdev_len;
+ buf += subdev_len;
+
+ /* increase flash offset by interleave size since oob blocks
+ * aligned with page size (i.e. interleave size) */
+ to_loc += stripe->interleave_size;
+
+ if(to_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ /* Push operation into the corresponding threads queue and rise
semaphores */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ err = ops[i].status;
+ }
+
+ /* Deallocate all memory before exit */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_oob(): written %d bytes\n",
*retlen);
+ return err;
+}
+
+/* this routine aimed to support striping on NOR_ECC
+ * it has been taken from cfi_cmdset_0001.c
+ */
+static int
+stripe_writev (struct mtd_info *mtd, const struct kvec *vecs, unsigned
long count,
+ loff_t to, size_t * retlen)
+{
+ int i, page, len, total_len, ret = 0, written = 0, cnt = 0,
towrite;
+ u_char *bufstart;
+ char* data_poi;
+ char* data_buf;
+ loff_t write_offset;
+ int rl_wr;
+
+ u_int32_t pagesize;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "==> stripe_writev()\n");
+
+#ifdef MTD_PROGRAM_REGIONS
+ /* Montavista patch for Sibley support detected */
+ if(mtd->flags & MTD_PROGRAM_REGIONS)
+ {
+ pagesize = MTD_PROGREGION_SIZE(mtd);
+ }
+ else if(mtd->flags & MTD_ECC)
+ {
+ pagesize = mtd->eccsize;
+ }
+ else
+ {
+ printk(KERN_ERR "stripe_writev() has been called for device
without MTD_PROGRAM_REGIONS or MTD_ECC set\n");
+ return -EINVAL;
+ }
+#else
+ if(mtd->flags & MTD_ECC)
+ {
+ pagesize = mtd->eccsize;
+ }
+ else
+ {
+ printk(KERN_ERR "stripe_writev() has been called for device
without MTD_ECC set\n");
+ return -EINVAL;
+ }
+#endif
+
+ data_buf = kmalloc(pagesize, GFP_KERNEL);
+
+ /* Preset written len for early exit */
+ *retlen = 0;
+
+ /* Calculate total length of data */
+ total_len = 0;
+ for (i = 0; i < count; i++)
+ total_len += (int) vecs[i].iov_len;
+
+ /* check if no data is going to be written */
+ if(!total_len)
+ {
+ kfree(data_buf);
+ return 0;
+ }
+
+ /* Do not allow write past end of page */
+ if ((to + total_len) > mtd->size) {
+ DEBUG (MTD_DEBUG_LEVEL0, "stripe_writev(): Attempted write past
end of device\n");
+ kfree(data_buf);
+ return -EINVAL;
+ }
+
+ /* Setup start page */
+ page = ((int) to) / pagesize;
+ towrite = (page + 1) * pagesize - to; /* rest of the page */
+ write_offset = to;
+ written = 0;
+ /* Loop until all iovecs' data has been written */
+ len = 0;
+ while (len < total_len) {
+ bufstart = (u_char *)vecs->iov_base;
+ bufstart += written;
+ data_poi = bufstart;
+
+ /* If the given tuple is >= reet of page then
+ * write it out from the iov
+ */
+ if ( (vecs->iov_len-written) >= towrite) { /* The fastest
case is to write data by int * blocksize */
+ ret = mtd->write(mtd, write_offset, towrite, &rl_wr,
data_poi);
+ if(ret)
+ break;
+ len += towrite;
+ page ++;
+ write_offset = page * pagesize;
+ towrite = pagesize;
+ written += towrite;
+ if(vecs->iov_len == written) {
+ vecs ++;
+ written = 0;
+ }
+ }
+ else
+ {
+ cnt = 0;
+ while(cnt < towrite ) {
+ data_buf[cnt++] = ((u_char *)
vecs->iov_base)[written++];
+ if(vecs->iov_len == written )
+ {
+ if((cnt+len) == total_len )
+ break;
+ vecs ++;
+ written = 0;
+ }
+ }
+ data_poi = data_buf;
+ ret = mtd->write(mtd, write_offset, cnt, &rl_wr, data_poi);
+ if (ret)
+ break;
+ len += cnt;
+ page ++;
+ write_offset = page * pagesize;
+ towrite = pagesize;
+ }
+ }
+
+ if(retlen)
+ *retlen = len;
+ kfree(data_buf);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_writev()\n");
+
+ return ret;
+}
+
+
+static int
+stripe_writev_ecc (struct mtd_info *mtd, const struct kvec *vecs,
unsigned long count,
+ loff_t to, size_t * retlen, u_char *eccbuf, struct
nand_oobinfo *oobsel)
+{
+ int i, page, len, total_len, ret = 0, written = 0, cnt = 0,
towrite;
+ u_char *bufstart;
+ char* data_poi;
+ char* data_buf;
+ loff_t write_offset;
+ data_buf = kmalloc(mtd->oobblock, GFP_KERNEL);
+ int rl_wr;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "==> stripe_writev_ecc()\n");
+
+ if(oobsel != NULL)
+ {
+ /* check if oobinfo is has been chandes by FS */
+ if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
+ {
+ printk(KERN_ERR "stripe_writev_ecc(): oobinfo has been
changed by FS (not supported yet)\n");
+ kfree(data_buf);
+ return -EINVAL;
+ }
+ }
+
+ if(!(mtd->flags & MTD_ECC))
+ {
+ printk(KERN_ERR "stripe_writev_ecc() has been called for device
without MTD_ECC set\n");
+ kfree(data_buf);
+ return -EINVAL;
+ }
+
+ /* Preset written len for early exit */
+ *retlen = 0;
+
+ /* Calculate total length of data */
+ total_len = 0;
+ for (i = 0; i < count; i++)
+ total_len += (int) vecs[i].iov_len;
+
+ /* check if no data is going to be written */
+ if(!total_len)
+ {
+ kfree(data_buf);
+ return 0;
+ }
+
+ /* Do not allow write past end of page */
+ if ((to + total_len) > mtd->size) {
+ DEBUG (MTD_DEBUG_LEVEL0, "stripe_writev_ecc(): Attempted write
past end of device\n");
+ kfree(data_buf);
+ return -EINVAL;
+ }
+
+ /* Check "to" and "len" alignment here */
+ /* NOTE: can't use if(to & (mtd->ooblock - 1)) alignment check here
since
+ * mtd->oobblock can be not-power-of-two number */
+ if((((int) to) % mtd->oobblock) || (total_len % mtd->oobblock))
+ {
+ printk(KERN_ERR "stripe_writev_ecc(): Attempted write not
aligned data!\n");
+ kfree(data_buf);
+ return -EINVAL;
+ }
+
+ /* Setup start page. Notaligned data is not allowed for write_ecc.
*/
+ page = ((int) to) / mtd->oobblock;
+ towrite = (page + 1) * mtd->oobblock - to; /* aligned with
oobblock */
+ write_offset = to;
+ written = 0;
+ /* Loop until all iovecs' data has been written */
+ len = 0;
+ while (len < total_len) {
+ bufstart = (u_char *)vecs->iov_base;
+ bufstart += written;
+ data_poi = bufstart;
+
+ /* If the given tuple is >= reet of page then
+ * write it out from the iov
+ */
+ if ( (vecs->iov_len-written) >= towrite) { /* The fastest
case is to write data by int * blocksize */
+ ret = mtd->write_ecc(mtd, write_offset, towrite, &rl_wr,
data_poi, eccbuf, oobsel);
+ if(ret)
+ break;
+ len += rl_wr;
+ page ++;
+ write_offset = page * mtd->oobblock;
+ towrite = mtd->oobblock;
+ written += towrite;
+ if(vecs->iov_len == written) {
+ vecs ++;
+ written = 0;
+ }
+
+ if(eccbuf)
+ eccbuf += mtd->oobavail;
+ }
+ else
+ {
+ cnt = 0;
+ while(cnt < towrite ) {
+ data_buf[cnt++] = ((u_char *)
vecs->iov_base)[written++];
+ if(vecs->iov_len == written )
+ {
+ if((cnt+len) == total_len )
+ break;
+ vecs ++;
+ written = 0;
+ }
+ }
+ data_poi = data_buf;
+ ret = mtd->write_ecc(mtd, write_offset, cnt, &rl_wr,
data_poi, eccbuf, oobsel);
+ if (ret)
+ break;
+ len += rl_wr;
+ page ++;
+ write_offset = page * mtd->oobblock;
+ towrite = mtd->oobblock;
+
+ if(eccbuf)
+ eccbuf += mtd->oobavail;
+ }
+ }
+
+ if(retlen)
+ *retlen = len;
+ kfree(data_buf);
+
+ DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_writev_ecc()\n");
+
+ return ret;
+}
+
+
+static void
+stripe_erase_callback(struct erase_info *instr)
+{
+ wake_up((wait_queue_head_t *) instr->priv);
+}
+
+static int
+stripe_dev_erase(struct mtd_info *mtd, struct erase_info *erase)
+{
+ int err;
+ wait_queue_head_t waitq;
+ DECLARE_WAITQUEUE(wait, current);
+
+ init_waitqueue_head(&waitq);
+
+ erase->mtd = mtd;
+ erase->callback = stripe_erase_callback;
+ erase->priv = (unsigned long) &waitq;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_dev_erase(): addr=0x%08x,
len=%d\n", erase->addr, erase->len);
+
+ /*
+ * FIXME: Allow INTERRUPTIBLE. Which means
+ * not having the wait_queue head on the stack.
+ */
+ err = mtd->erase(mtd, erase);
+ if (!err)
+ {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ add_wait_queue(&waitq, &wait);
+ if (erase->state != MTD_ERASE_DONE
+ && erase->state != MTD_ERASE_FAILED)
+ schedule();
+ remove_wait_queue(&waitq, &wait);
+ set_current_state(TASK_RUNNING);
+
+ err = (erase->state == MTD_ERASE_FAILED) ? -EIO : 0;
+ }
+ return err;
+}
+
+static int
+stripe_erase(struct mtd_info *mtd, struct erase_info *instr)
+{
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int i, err;
+ struct mtd_stripe_erase_bounds *erase_bounds;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to erase
(bytes) */
+ size_t subdev_len; /* data size to be erased at
this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left; /* total data size left to be
erased (bytes) */
+ size_t len_done; /* total data size erased */
+ u_int32_t from;
+
+ struct mtd_stripe_op *ops; /* operations array (one per
thread) */
+ u_int32_t size; /* amount of memory to be
allocated for thread operations */
+ u_int32_t queue_size;
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_earse(): addr=0x%08x, len=%d\n",
instr->addr, instr->len);
+
+ if(!(mtd->flags & MTD_WRITEABLE))
+ return -EROFS;
+
+ if(instr->addr > stripe->mtd.size)
+ return -EINVAL;
+
+ if(instr->len + instr->addr > stripe->mtd.size)
+ return -EINVAL;
+
+ /*
+ * Check for proper erase block alignment of the to-be-erased area.
+ */
+ if(!stripe->mtd.numeraseregions)
+ {
+ /* striped device has uniform erase block size */
+ /* NOTE: can't use if(instr->addr & (stripe->mtd.erasesize - 1))
alignment check here
+ * since stripe->mtd.erasesize can be not-power-of-two number */
+ if(instr->addr % stripe->mtd.erasesize || instr->len %
stripe->mtd.erasesize)
+ return -EINVAL;
+ }
+ else
+ {
+ /* we should not get here */
+ return -EINVAL;
+ }
+
+ instr->fail_addr = 0xffffffff;
+
+ /* allocate memory for multithread operations */
+ queue_size = 1; /* queue size for erase opration is 1 */
+ size = stripe->num_subdev *
SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
+ ops = kmalloc(size, GFP_KERNEL);
+ if(!ops)
+ {
+ printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
+ return -ENOMEM;
+ }
+
+ memset(ops, 0, size);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ ops[i].opcode = MTD_STRIPE_OPCODE_ERASE;
+ ops[i].caller_id = 0; /* TBD */
+ init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
to be unlocked by device thread */
+ //ops[i].status = 0; /* TBD */
+ ops[i].fail_addr = 0xffffffff;
+
+ INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
suboperation list head */
+
+ ops[i].subops.ops_num = 0; /* to be increased later
here */
+ ops[i].subops.ops_num_max = queue_size; /* total number of
suboperations can be stored in the array */
+ ops[i].subops.ops_array = (struct subop *)((char *)(ops +
stripe->num_subdev) + i * queue_size * sizeof(struct subop));
+ }
+
+ len_left = instr->len;
+ len_done = 0;
+ from = instr->addr;
+
+ /* allocate memory for erase boundaries for all subdevices */
+ erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
mtd_stripe_erase_bounds), GFP_KERNEL);
+ if(!erase_bounds)
+ {
+ kfree(ops);
+ return -ENOMEM;
+ }
+ memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
stripe->num_subdev);
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from - stripe->subdev_last_offset[i - 1])
/ stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from / stripe->interleave_size) / dev_count;
+ subdev_number = (from / stripe->interleave_size) % dev_count;
+ }
+
+ /* Should by optimized for erase op */
+ subdev_offset_low = from % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add/extend block-to-be erased */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset_low;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+ len_left -= subdev_len;
+ len_done += subdev_len;
+
+ if(from + len_done >= stripe->subdev_last_offset[stripe->num_subdev
- dev_count])
+ dev_count--;
+
+ while(len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size; /* can by optimized for erase op*/
+
+ /* Add/extend block-to-be erased */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset *
stripe->interleave_size;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+ len_left -= subdev_len;
+ len_done += subdev_len;
+
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_erase(): device = %d, addr =
0x%08x, len = %d\n", subdev_number, erase_bounds[subdev_number].addr,
erase_bounds[subdev_number].len);
+
+ if(from + len_done >=
stripe->subdev_last_offset[stripe->num_subdev - dev_count])
+ dev_count--;
+ }
+
+ /* now do the erase: */
+ err = 0;
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ if(erase_bounds[i].need_erase)
+ {
+ if (!(stripe->subdev[i]->flags & MTD_WRITEABLE))
+ {
+ err = -EROFS;
+ break;
+ }
+
+ stripe_add_subop(&ops[i], erase_bounds[i].addr,
erase_bounds[i].len, (u_char *)instr, NULL);
+ }
+ }
+
+ /* Push operation queues into the corresponding threads */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ if(erase_bounds[i].need_erase)
+ {
+ stripe_add_op(&stripe->sw_threads[i], &ops[i]);
+
+ /* set original operation priority */
+ ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
+ stripe_set_write_thread_prio(&stripe->sw_threads[i]);
+
+ up(&stripe->sw_threads[i].sw_thread_wait);
+ }
+ }
+
+ /* wait for all suboperations completed and check status */
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ if(erase_bounds[i].need_erase)
+ {
+ down(&ops[i].sem);
+
+ /* set error if one of operations has failed */
+ if(ops[i].status)
+ {
+ err = ops[i].status;
+
+ /* FIX ME: For now this adddres shows address
+ * at the last failed subdevice,
+ * but not at the "super" device */
+ if(ops[i].fail_addr != 0xffffffff)
+ instr->fail_addr = ops[i].fail_addr;
+ }
+
+ instr->state = ops[i].state;
+ }
+ }
+
+ /* Deallocate all memory before exit */
+ kfree(erase_bounds);
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ stripe_destroy_op(&ops[i]);
+ }
+ kfree(ops);
+
+ if(err)
+ return err;
+
+ if(instr->callback)
+ instr->callback(instr);
+ return 0;
+}
+
+static int
+stripe_lock(struct mtd_info *mtd, loff_t ofs, size_t len)
+{
+ u_int32_t ofs_loc = (u_int32_t)ofs; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to lock
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be locked @
subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to lock left
(bytes) */
+
+ size_t retlen = 0;
+ struct mtd_stripe_erase_bounds *erase_bounds;
+
+ /* Check whole striped device bounds here */
+ if(ofs_loc + len > mtd->size)
+ return err;
+
+ /* allocate memory for lock boundaries for all subdevices */
+ erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
mtd_stripe_erase_bounds), GFP_KERNEL);
+ if(!erase_bounds)
+ return -ENOMEM;
+ memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
stripe->num_subdev);
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(ofs_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((ofs_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((ofs_loc - stripe->subdev_last_offset[i
- 1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (ofs_loc / stripe->interleave_size) / dev_count;
+ subdev_number = (ofs_loc / stripe->interleave_size) % dev_count;
+ }
+
+ subdev_offset_low = ofs_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add/extend block-to-be locked */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset_low;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+
+ retlen += subdev_len;
+ len_left -= subdev_len;
+ if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+
+ while(len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Add/extend block-to-be locked */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset *
stripe->interleave_size;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+
+ retlen += subdev_len;
+ len_left -= subdev_len;
+
+ if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev
- dev_count])
+ dev_count--;
+ }
+
+ /* now do lock */
+ err = 0;
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ if(erase_bounds[i].need_erase)
+ {
+ if (stripe->subdev[i]->lock)
+ {
+ err = stripe->subdev[i]->lock(stripe->subdev[i],
erase_bounds[i].addr, erase_bounds[i].len);
+ if(err)
+ break;
+ };
+ }
+ }
+
+ /* Free allocated memory here */
+ kfree(erase_bounds);
+
+ return err;
+}
+
+static int
+stripe_unlock(struct mtd_info *mtd, loff_t ofs, size_t len)
+{
+ u_int32_t ofs_loc = (u_int32_t)ofs; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to unlock
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be unlocked @
subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = len; /* total data size to unlock
left (bytes) */
+
+ size_t retlen = 0;
+ struct mtd_stripe_erase_bounds *erase_bounds;
+
+ /* Check whole striped device bounds here */
+ if(ofs_loc + len > mtd->size)
+ return err;
+
+ /* allocate memory for unlock boundaries for all subdevices */
+ erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
mtd_stripe_erase_bounds), GFP_KERNEL);
+ if(!erase_bounds)
+ return -ENOMEM;
+ memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
stripe->num_subdev);
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(ofs_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((ofs_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((ofs_loc - stripe->subdev_last_offset[i
- 1]) / stripe->interleave_size) % dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (ofs_loc / stripe->interleave_size) / dev_count;
+ subdev_number = (ofs_loc / stripe->interleave_size) % dev_count;
+ }
+
+ subdev_offset_low = ofs_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* Add/extend block-to-be unlocked */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset_low;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+
+ retlen += subdev_len;
+ len_left -= subdev_len;
+ if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+
+ while(len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* Add/extend block-to-be unlocked */
+ if(!erase_bounds[subdev_number].need_erase)
+ {
+ erase_bounds[subdev_number].need_erase = 1;
+ erase_bounds[subdev_number].addr = subdev_offset *
stripe->interleave_size;
+ }
+ erase_bounds[subdev_number].len += subdev_len;
+
+ retlen += subdev_len;
+ len_left -= subdev_len;
+
+ if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev
- dev_count])
+ dev_count--;
+ }
+
+ /* now do unlock */
+ err = 0;
+ for(i = 0; i < stripe->num_subdev; i++)
+ {
+ if(erase_bounds[i].need_erase)
+ {
+ if (stripe->subdev[i]->unlock)
+ {
+ err = stripe->subdev[i]->unlock(stripe->subdev[i],
erase_bounds[i].addr, erase_bounds[i].len);
+ if(err)
+ break;
+ };
+ }
+ }
+
+ /* Free allocated memory here */
+ kfree(erase_bounds);
+
+ return err;
+}
+
+static void
+stripe_sync(struct mtd_info *mtd)
+{
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int i;
+
+ for (i = 0; i < stripe->num_subdev; i++)
+ {
+ struct mtd_info *subdev = stripe->subdev[i];
+ if (subdev->sync)
+ subdev->sync(subdev);
+ }
+}
+
+static int
+stripe_suspend(struct mtd_info *mtd)
+{
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int i, rc = 0;
+
+ for (i = 0; i < stripe->num_subdev; i++)
+ {
+ struct mtd_info *subdev = stripe->subdev[i];
+ if (subdev->suspend)
+ {
+ if ((rc = subdev->suspend(subdev)) < 0)
+ return rc;
+ };
+ }
+ return rc;
+}
+
+static void
+stripe_resume(struct mtd_info *mtd)
+{
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int i;
+
+ for (i = 0; i < stripe->num_subdev; i++)
+ {
+ struct mtd_info *subdev = stripe->subdev[i];
+ if (subdev->resume)
+ subdev->resume(subdev);
+ }
+}
+
+static int
+stripe_block_isbad(struct mtd_info *mtd, loff_t ofs)
+{
+ u_int32_t from_loc = (u_int32_t)ofs; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int res = 0;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = mtd->oobblock; /* total data size to read/write
left (bytes) */
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_block_isbad(): offset = 0x%08x\n",
from_loc);
+
+ from_loc = (from_loc / mtd->oobblock) * mtd->oobblock; /* align
offset here */
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* check block on subdevice is bad here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_isbad(): device = %d, offset
= 0x%08x\n", subdev_number, subdev_offset_low);
+ res =
stripe->subdev[subdev_number]->block_isbad(stripe->subdev[subdev_number]
, subdev_offset_low);
+ if(!res)
+ {
+ len_left -= subdev_len;
+ from_loc += subdev_len;
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ while(!res && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* check block on subdevice is bad here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_isbad(): device = %d,
offset = 0x%08x\n", subdev_number, subdev_offset *
stripe->interleave_size);
+ res =
stripe->subdev[subdev_number]->block_isbad(stripe->subdev[subdev_number]
, subdev_offset * stripe->interleave_size);
+ if(res)
+ {
+ break;
+ }
+ else
+ {
+ len_left -= subdev_len;
+ from_loc += subdev_len;
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev
- dev_count])
+ dev_count--;
+ }
+ }
+
+ DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_block_isbad()\n");
+ return res;
+}
+
+/* returns 0 - success */
+static int
+stripe_block_markbad(struct mtd_info *mtd, loff_t ofs)
+{
+ u_int32_t from_loc = (u_int32_t)ofs; /* we can do this since
whole MTD size in current implementation has u_int32_t type */
+
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int err = -EINVAL;
+ int i;
+
+ u_int32_t subdev_offset; /* equal size subdevs offset
(interleaved block size count)*/
+ u_int32_t subdev_number; /* number of current subdev */
+ u_int32_t subdev_offset_low; /* subdev offset to read/write
(bytes). used for "first" probably unaligned with erasesize data block
*/
+ size_t subdev_len; /* data size to be read/written
from/to subdev at this turn (bytes) */
+ int dev_count; /* equal size subdev count */
+ size_t len_left = mtd->oobblock; /* total data size to read/write
left (bytes) */
+
+ DEBUG(MTD_DEBUG_LEVEL2, "stripe_block_markbad(): offset =
0x%08x\n", from_loc);
+
+ from_loc = (from_loc / mtd->oobblock) * mtd->oobblock; /* align
offset here */
+
+ /* Locate start position and corresponding subdevice number */
+ subdev_offset = 0;
+ subdev_number = 0;
+ dev_count = stripe->num_subdev;
+ for(i = (stripe->num_subdev - 1); i > 0; i--)
+ {
+ if(from_loc >= stripe->subdev_last_offset[i-1])
+ {
+ dev_count = stripe->num_subdev - i; /* get "equal size"
devices count */
+ subdev_offset = stripe->subdev[i - 1]->size /
stripe->interleave_size - 1;
+ subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
1]) / stripe->interleave_size) / dev_count;
+ subdev_number = i + ((from_loc -
stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
dev_count;
+ break;
+ }
+ }
+
+ if(subdev_offset == 0)
+ {
+ subdev_offset = (from_loc / stripe->interleave_size) /
dev_count;
+ subdev_number = (from_loc / stripe->interleave_size) %
dev_count;
+ }
+
+ subdev_offset_low = from_loc % stripe->interleave_size;
+ subdev_len = (len_left < (stripe->interleave_size -
subdev_offset_low)) ? len_left : (stripe->interleave_size -
subdev_offset_low);
+ subdev_offset_low += subdev_offset * stripe->interleave_size;
+
+ /* check block on subdevice is bad here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_markbad(): device = %d,
offset = 0x%08x\n", subdev_number, subdev_offset_low);
+ err =
stripe->subdev[subdev_number]->block_markbad(stripe->subdev[subdev_numbe
r], subdev_offset_low);
+ if(!err)
+ {
+ len_left -= subdev_len;
+ from_loc += subdev_len;
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
dev_count])
+ dev_count--;
+ }
+
+ while(!err && len_left > 0 && dev_count > 0)
+ {
+ subdev_number++;
+ if(subdev_number >= stripe->num_subdev)
+ {
+ subdev_number = stripe->num_subdev - dev_count;
+ subdev_offset++;
+ }
+ subdev_len = (len_left < stripe->interleave_size) ? len_left :
stripe->interleave_size;
+
+ /* check block on subdevice is bad here */
+ DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_markbad(): device = %d,
offset = 0x%08x\n", subdev_number, subdev_offset *
stripe->interleave_size);
+ err =
stripe->subdev[subdev_number]->block_markbad(stripe->subdev[subdev_numbe
r], subdev_offset * stripe->interleave_size);
+ if(err)
+ {
+ break;
+ }
+ else
+ {
+ len_left -= subdev_len;
+ from_loc += subdev_len;
+ if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev
- dev_count])
+ dev_count--;
+ }
+ }
+
+ DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_block_markbad()\n");
+ return err;
+}
+
+/*
+ * This function constructs a virtual MTD device by interleaving
(striping)
+ * num_devs MTD devices. A pointer to the new device object is
+ * stored to *new_dev upon success. This function does _not_
+ * register any devices: this is the caller's responsibility.
+ */
+struct mtd_info *mtd_stripe_create(struct mtd_info *subdev[], /*
subdevices to stripe */
+ int num_devs, /*
number of subdevices */
+ char *name, /* name
for the new device */
+ int interleave_size) /*
interleaving size (sanity check is required) */
+{
+ int i,j;
+ size_t size;
+ struct mtd_stripe *stripe;
+ u_int32_t curr_erasesize;
+ int sort_done = 0;
+
+ printk(KERN_NOTICE "Striping MTD devices:\n");
+ for (i = 0; i < num_devs; i++)
+ printk(KERN_NOTICE "(%d): \"%s\"\n", i, subdev[i]->name);
+ printk(KERN_NOTICE "into device \"%s\"\n", name);
+
+ /* check if trying to stripe same device */
+ for(i = 0; i < num_devs; i++)
+ {
+ for(j = i; j < num_devs; j++)
+ {
+ if(i != j && !(strcmp(subdev[i]->name,subdev[j]->name)))
+ {
+ printk(KERN_ERR "MTD Stripe failed. The same subdevice
names were found.\n");
+ return NULL;
+ }
+ }
+ }
+
+ /* allocate the device structure */
+ size = SIZEOF_STRUCT_MTD_STRIPE(num_devs);
+ stripe = kmalloc(size, GFP_KERNEL);
+ if (!stripe)
+ {
+ printk(KERN_ERR "mtd_stripe_create(): memory allocation
error\n");
+ return NULL;
+ }
+ memset(stripe, 0, size);
+ stripe->subdev = (struct mtd_info **) (stripe + 1);
+ stripe->subdev_last_offset = (u_int32_t *) ((char *)(stripe + 1) +
num_devs * sizeof(struct mtd_info *));
+ stripe->sw_threads = (struct mtd_sw_thread_info *)((char *)(stripe
+ 1) + num_devs * sizeof(struct mtd_info *) + num_devs *
sizeof(u_int32_t));
+
+ /*
+ * Set up the new "super" device's MTD object structure, check for
+ * incompatibilites between the subdevices.
+ */
+ stripe->mtd.type = subdev[0]->type;
+ stripe->mtd.flags = subdev[0]->flags;
+ stripe->mtd.size = subdev[0]->size;
+ stripe->mtd.erasesize = subdev[0]->erasesize;
+ stripe->mtd.oobblock = subdev[0]->oobblock;
+ stripe->mtd.oobsize = subdev[0]->oobsize;
+ stripe->mtd.oobavail = subdev[0]->oobavail;
+ stripe->mtd.ecctype = subdev[0]->ecctype;
+ stripe->mtd.eccsize = subdev[0]->eccsize;
+ if (subdev[0]->read_ecc)
+ stripe->mtd.read_ecc = stripe_read_ecc;
+ if (subdev[0]->write_ecc)
+ stripe->mtd.write_ecc = stripe_write_ecc;
+ if (subdev[0]->read_oob)
+ stripe->mtd.read_oob = stripe_read_oob;
+ if (subdev[0]->write_oob)
+ stripe->mtd.write_oob = stripe_write_oob;
+
+ stripe->subdev[0] = subdev[0];
+
+ for(i = 1; i < num_devs; i++)
+ {
+ /*
+ * Check device compatibility,
+ */
+ if(stripe->mtd.type != subdev[i]->type)
+ {
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): incompatible device
type on \"%s\"\n",
+ subdev[i]->name);
+ return NULL;
+ }
+
+ /*
+ * Check MTD flags
+ */
+ if(stripe->mtd.flags != subdev[i]->flags)
+ {
+ /*
+ * Expect all flags to be
+ * equal on all subdevices.
+ */
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): incompatible device
flags on \"%s\"\n",
+ subdev[i]->name);
+ return NULL;
+ }
+
+ stripe->mtd.size += subdev[i]->size;
+
+ /*
+ * Check OOB and ECC data
+ */
+ if (stripe->mtd.oobblock != subdev[i]->oobblock ||
+ stripe->mtd.oobsize != subdev[i]->oobsize ||
+ stripe->mtd.oobavail != subdev[i]->oobavail ||
+ stripe->mtd.ecctype != subdev[i]->ecctype ||
+ stripe->mtd.eccsize != subdev[i]->eccsize ||
+ !stripe->mtd.read_ecc != !subdev[i]->read_ecc ||
+ !stripe->mtd.write_ecc != !subdev[i]->write_ecc ||
+ !stripe->mtd.read_oob != !subdev[i]->read_oob ||
+ !stripe->mtd.write_oob != !subdev[i]->write_oob)
+ {
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): incompatible OOB or
ECC data on \"%s\"\n",
+ subdev[i]->name);
+ return NULL;
+ }
+ stripe->subdev[i] = subdev[i];
+ }
+
+ stripe->num_subdev = num_devs;
+ stripe->mtd.name = name;
+
+ /*
+ * Main MTD routines
+ */
+ stripe->mtd.erase = stripe_erase;
+ stripe->mtd.read = stripe_read;
+ stripe->mtd.write = stripe_write;
+ stripe->mtd.sync = stripe_sync;
+ stripe->mtd.lock = stripe_lock;
+ stripe->mtd.unlock = stripe_unlock;
+ stripe->mtd.suspend = stripe_suspend;
+ stripe->mtd.resume = stripe_resume;
+
+#ifdef MTD_PROGRAM_REGIONS
+ /* Montavista patch for Sibley support detected */
+ if((stripe->mtd.flags & MTD_PROGRAM_REGIONS) ||
(stripe->mtd.flags & MTD_ECC))
+ stripe->mtd.writev = stripe_writev;
+#else
+ if(stripe->mtd.flags & MTD_ECC)
+ stripe->mtd.writev = stripe_writev;
+#endif
+
+ /* not sure about that case. probably should be used not only for
NAND */
+ if(stripe->mtd.type == MTD_NANDFLASH)
+ stripe->mtd.writev_ecc = stripe_writev_ecc;
+
+ if(subdev[0]->block_isbad)
+ stripe->mtd.block_isbad = stripe_block_isbad;
+
+ if(subdev[0]->block_markbad)
+ stripe->mtd.block_markbad = stripe_block_markbad;
+
+ /* NAND specific */
+ if(stripe->mtd.type == MTD_NANDFLASH)
+ {
+ stripe->mtd.oobblock *= num_devs;
+ stripe->mtd.oobsize *= num_devs;
+ stripe->mtd.oobavail *= num_devs; /* oobavail is to be changed
later in stripe_merge_oobinfo() */
+ stripe->mtd.eccsize *= num_devs;
+ }
+
+#ifdef MTD_PROGRAM_REGIONS
+ /* Montavista patch for Sibley support detected */
+ if(stripe->mtd.flags & MTD_PROGRAM_REGIONS)
+ stripe->mtd.oobblock *= num_devs;
+ else if(stripe->mtd.flags & MTD_ECC)
+ stripe->mtd.eccsize *= num_devs;
+#else
+ if(stripe->mtd.flags & MTD_ECC)
+ stripe->mtd.eccsize *= num_devs;
+#endif
+
+ /* Sort all subdevices by their size (from largest to smallest)*/
+ while(!sort_done)
+ {
+ sort_done = 1;
+ for(i=0; i < num_devs - 1; i++)
+ {
+ struct mtd_info *subdev = stripe->subdev[i];
+ if(subdev->size > stripe->subdev[i+1]->size)
+ {
+ stripe->subdev[i] = stripe->subdev[i+1];
+ stripe->subdev[i+1] = subdev;
+ sort_done = 0;
+ }
+ }
+ }
+
+ /* Create new device with uniform erase size */
+ curr_erasesize = subdev[0]->erasesize;
+ for (i = 1; i < num_devs; i++)
+ {
+ curr_erasesize = lcm(curr_erasesize, subdev[i]->erasesize);
+ }
+ curr_erasesize *= num_devs;
+
+ /* Check if there are different size devices in the array*/
+ for (i = 1; i < num_devs; i++)
+ {
+ /* note: subdevices must be already sorted by their size here */
+ if(subdev[i - 1]->size > subdev[i]->size)
+ {
+ u_int32_t tmp_erasesize = subdev[i]->erasesize;
+ for(j = 0; j < i; j++)
+ {
+ tmp_erasesize = lcm(tmp_erasesize,
subdev[j]->erasesize);
+ }
+ tmp_erasesize *= i;
+ curr_erasesize = lcm(curr_erasesize, tmp_erasesize);
+ }
+ }
+
+ /* Check if erase size found is valid */
+ if(curr_erasesize <= 0)
+ {
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): Can't find lcm of
subdevice erase sizes\n");
+ return NULL;
+ }
+
+ /* Check interleave size validity here */
+ if(curr_erasesize % interleave_size)
+ {
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): Wrong interleave size\n");
+ return NULL;
+ }
+ stripe->interleave_size = interleave_size;
+
+ stripe->mtd.erasesize = curr_erasesize;
+ stripe->mtd.numeraseregions = 0;
+
+ /* update (truncate) super device size in accordance with new
erasesize */
+ stripe->mtd.size = (stripe->mtd.size / stripe->mtd.erasesize) *
stripe->mtd.erasesize;
+
+ /* Calculate last data offset for each striped device */
+ for (i = 0; i < num_devs; i++)
+ stripe->subdev_last_offset[i] = last_offset(stripe, i);
+
+ /* NAND specific */
+ if(stripe->mtd.type == MTD_NANDFLASH)
+ {
+ /* Fill oobavail with correct values here */
+ for (i = 0; i < num_devs; i++)
+ stripe->subdev[i]->oobavail =
stripe_get_oobavail(stripe->subdev[i]);
+
+ /* Sets new device oobinfo
+ * NAND flash check is performed inside stripe_merge_oobinfo()
+ * - this should be made after subdevices sorting done for
proper eccpos and oobfree positioning
+ * NOTE: there are some limitations with different size NAND
devices striping. all devices must have
+ * the same oobfree and eccpos maps */
+ if(stripe_merge_oobinfo(&stripe->mtd, subdev, num_devs))
+ {
+ kfree(stripe);
+ printk(KERN_ERR "mtd_stripe_create(): oobinfo merge has
failed\n");
+ return NULL;
+ }
+ }
+
+ /* Create worker threads */
+ for (i = 0; i < num_devs; i++)
+ {
+ if(stripe_start_write_thread(&stripe->sw_threads[i],
stripe->subdev[i]) < 0)
+ {
+ kfree(stripe);
+ return NULL;
+ }
+ }
+
+ return &stripe->mtd;
+}
+
+/*
+ * This function destroys an Striped MTD object
+ */
+void mtd_stripe_destroy(struct mtd_info *mtd)
+{
+ struct mtd_stripe *stripe = STRIPE(mtd);
+ int i;
+
+ if (stripe->mtd.numeraseregions)
+ /* we should not get here. so just in case. */
+ kfree(stripe->mtd.eraseregions);
+
+ /* destroy writing threads */
+ for (i = 0; i < stripe->num_subdev; i++)
+ stripe_stop_write_thread(&stripe->sw_threads[i]);
+
+ kfree(stripe);
+}
+
+
+#ifdef CMDLINE_PARSER_STRIPE
+/*
+ * MTD stripe init and cmdline parsing routines
+ */
+
+static int
+parse_cmdline_stripe_part(struct mtd_stripe_info *info, char *s)
+{
+ int ret = 0;
+
+ struct mtd_stripe_info *new_stripe = NULL;
+ unsigned int name_size;
+ char *subdev_name;
+ char *e;
+ int j;
+
+ DEBUG(MTD_DEBUG_LEVEL1, "parse_cmdline_stripe_part(): arg = %s\n",
s);
+
+ /* parse new striped device name and allocate stripe info structure
*/
+ if(!(e = strchr(s,'(')) || (e == s))
+ return -EINVAL;
+
+ name_size = (unsigned int)(e - s);
+ new_stripe = kmalloc(sizeof(struct mtd_stripe_info) + name_size +
1, GFP_KERNEL);
+ if(!new_stripe) {
+ printk(KERN_ERR "parse_cmdline_stripe_part(): memory allocation
error!\n");
+ return -ENOMEM;
+ }
+ memset(new_stripe,0,sizeof(struct mtd_stripe_info) + name_size +
1);
+ new_stripe->name = (char *)(new_stripe + 1);
+
+ INIT_LIST_HEAD(&new_stripe->list);
+
+ /* Store new device name */
+ strncpy(new_stripe->name, s, name_size);
+ s = e;
+
+ while(*s != 0)
+ {
+ switch(*s)
+ {
+ case '(':
+ s++;
+ new_stripe->interleave_size = simple_strtoul(s,&s,10);
+ if(!new_stripe->interleave_size || *s != ')')
+ ret = -EINVAL;
+ else
+ s++;
+ break;
+ case ':':
+ case ',':
+ case '.':
+#ifdef MODULE
+ if(bynumber)
+ {
+ s++;
+ j = simple_strtoul(s,&s,10);
+ if(j < MAX_MTD_DEVICES)
+ {
+ /* Set up and register striped MTD device */
+ down(&mtd_table_mutex);
+ new_stripe->devs[new_stripe->dev_num++] =
mtd_table[j];
+ up(&mtd_table_mutex);
+ }
+ else
+ {
+ ret = -EINVAL;
+ }
+ break;
+ }
+#endif
+
+ // proceed with subdevice names
+ if((e = strchr(++s,',')))
+ name_size = (unsigned int)(e - s);
+ else if((e = strchr(s,'.'))) /* this delimeter is to
be used for insmod params */
+ name_size = (unsigned int)(e - s);
+ else
+ name_size = strlen(s);
+
+ subdev_name = kmalloc(name_size + 1, GFP_KERNEL);
+ if(!subdev_name)
+ {
+ printk(KERN_ERR "parse_cmdline_stripe_part(): memory
allocation error!\n");
+ ret = -ENOMEM;
+ break;
+ }
+ strncpy(subdev_name,s,name_size);
+ *(subdev_name + name_size) = 0;
+
+ /* Set up and register striped MTD device */
+ down(&mtd_table_mutex);
+ for(j = 0; j < MAX_MTD_DEVICES; j++)
+ {
+ if(mtd_table[j] &&
!strcmp(subdev_name,mtd_table[j]->name))
+ {
+ new_stripe->devs[new_stripe->dev_num++] =
mtd_table[j];
+ break;
+ }
+ }
+ up(&mtd_table_mutex);
+
+ kfree(subdev_name);
+
+ if(j == MAX_MTD_DEVICES)
+ ret = -EINVAL;
+
+ s += name_size;
+
+ break;
+ default:
+ /* should not get here */
+ printk(KERN_ERR "stripe cmdline parse error\n");
+ ret = -EINVAL;
+ break;
+ };
+
+ if(ret)
+ break;
+ }
+
+ /* Check if all data parsed correctly. Sanity check. */
+ if(ret)
+ {
+ kfree(new_stripe);
+ }
+ else
+ {
+ list_add_tail(&new_stripe->list,&info->list);
+ DEBUG(MTD_DEBUG_LEVEL1, "Striped device %s parsed from
cmdline\n", new_stripe->name);
+ }
+
+ return ret;
+}
+
+/* cmdline format:
+ * mtdstripe=stripe1(128):vol3,vol5;stripe2(128):vol8,vol9 */
+static int
+parse_cmdline_stripes(struct mtd_stripe_info *info, char *s)
+{
+ int ret = 0;
+ char *part;
+ char *e;
+ int cmdline_part_size;
+
+ struct list_head *pos, *q;
+ struct mtd_stripe_info *stripe_info;
+
+ while(*s)
+ {
+ if(!(e = strchr(s,';')))
+ {
+ ret = parse_cmdline_stripe_part(info,s);
+ break;
+ }
+ else
+ {
+ cmdline_part_size = (int)(e - s);
+ part = kmalloc(cmdline_part_size + 1, GFP_KERNEL);
+ if(!part)
+ {
+ printk(KERN_ERR "parse_cmdline_stripes(): memory
allocation error!\n");
+ ret = -ENOMEM;
+ break;
+ }
+ strncpy(part,s,cmdline_part_size);
+ *(part + cmdline_part_size) = 0;
+ ret = parse_cmdline_stripe_part(info,part);
+ kfree(part);
+ if(ret)
+ break;
+ s = e + 1;
+ }
+ }
+
+ if(ret)
+ {
+ /* free all alocated memory in case of error */
+ list_for_each_safe(pos, q, &info->list) {
+ stripe_info = list_entry(pos, struct mtd_stripe_info, list);
+ list_del(&stripe_info->list);
+ kfree(stripe_info);
+ }
+ }
+
+ return ret;
+}
+
+/* initializes striped MTD devices
+ * to be called from mphysmap.c module or mtdstripe_init()
+ */
+int
+mtd_stripe_init(void)
+{
+ static struct mtd_stripe_info *dev_info;
+ struct list_head *pos, *q;
+
+ struct mtd_info* mtdstripe_info;
+
+ INIT_LIST_HEAD(&info.list);
+
+ /* parse cmdline */
+ if(!cmdline)
+ return 0;
+
+ if(parse_cmdline_stripes(&info,cmdline))
+ return -EINVAL;
+
+ /* go through the list and create new striped devices */
+ list_for_each_safe(pos, q, &info.list) {
+ dev_info = list_entry(pos, struct mtd_stripe_info, list);
+
+ mtdstripe_info = mtd_stripe_create(dev_info->devs,
dev_info->dev_num,
+ dev_info->name,
dev_info->interleave_size);
+ if(!mtdstripe_info)
+ {
+ printk(KERN_ERR "mtd_stripe_init: mtd_stripe_create() error
creating \"%s\"\n", dev_info->name);
+
+ /* remove registered striped device info from the list
+ * free memory allocated by parse_cmdline_stripes()
+ */
+ list_del(&dev_info->list);
+ kfree(dev_info);
+
+ return -EINVAL;
+ }
+ else
+ {
+ if(add_mtd_device(mtdstripe_info))
+ {
+ printk(KERN_ERR "mtd_stripe_init: add_mtd_device() error
creating \"%s\"\n", dev_info->name);
+ mtd_stripe_destroy(mtdstripe_info);
+
+ /* remove registered striped device info from the list
+ * free memory allocated by parse_cmdline_stripes()
+ */
+ list_del(&dev_info->list);
+ kfree(dev_info);
+
+ return -EINVAL;
+ }
+ else
+ printk(KERN_ERR "Striped device \"%s\" has been created
(interleave size %d bytes)\n",
+ dev_info->name, dev_info->interleave_size);
+ }
+ }
+
+ return 0;
+}
+
+/* removes striped devices */
+int
+mtd_stripe_exit(void)
+{
+ static struct mtd_stripe_info *dev_info;
+ struct list_head *pos, *q;
+ struct mtd_info *old_mtd_info;
+
+ int j;
+
+ /* go through the list and remove striped devices */
+ list_for_each_safe(pos, q, &info.list) {
+ dev_info = list_entry(pos, struct mtd_stripe_info, list);
+
+ down(&mtd_table_mutex);
+ for(j = 0; j < MAX_MTD_DEVICES; j++)
+ {
+ if(mtd_table[j] &&
!strcmp(dev_info->name,mtd_table[j]->name))
+ {
+ old_mtd_info = mtd_table[j];
+ up(&mtd_table_mutex); /* up here since del_mtd_device
down it */
+ del_mtd_device(mtd_table[j]);
+ down(&mtd_table_mutex);
+ mtd_stripe_destroy(old_mtd_info);
+ break;
+ }
+ }
+ up(&mtd_table_mutex);
+
+ /* remove registered striped device info from the list
+ * free memory allocated by parse_cmdline_stripes()
+ */
+ list_del(&dev_info->list);
+ kfree(dev_info);
+ }
+
+ return 0;
+}
+
+EXPORT_SYMBOL(mtd_stripe_init);
+EXPORT_SYMBOL(mtd_stripe_exit);
+#endif
+
+#ifdef CONFIG_MTD_CMDLINE_STRIPE
+#ifndef MODULE
+/*
+ * This is the handler for our kernel parameter, called from
+ * main.c::checksetup(). Note that we can not yet kmalloc() anything,
+ * so we only save the commandline for later processing.
+ *
+ * This function needs to be visible for bootloaders.
+ */
+int mtdstripe_setup(char *s)
+{
+ cmdline = s;
+ return 1;
+}
+
+__setup("mtdstripe=", mtdstripe_setup);
+#endif
+#endif
+
+EXPORT_SYMBOL(mtd_stripe_create);
+EXPORT_SYMBOL(mtd_stripe_destroy);
+
+#ifdef MODULE
+static int __init init_mtdstripe(void)
+{
+ if(byname)
+ cmdline = byname;
+ else if(bynumber)
+ cmdline = bynumber;
+
+ if(cmdline)
+ mtd_stripe_init();
+
+ return 0;
+}
+
+static void __exit exit_mtdstripe(void)
+{
+ if(cmdline)
+ mtd_stripe_exit();
+}
+
+module_init(init_mtdstripe);
+module_exit(exit_mtdstripe);
+#endif
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Alexander Belyakov <alexander.belyakov@intel.com>, Intel
Corporation");
+MODULE_DESCRIPTION("Generic support for striping of MTD devices");
diff -uNr a/include/linux/mtd/stripe.h b/include/linux/mtd/stripe.h
--- a/include/linux/mtd/stripe.h 1970-01-01 03:00:00.000000000
+0300
+++ b/include/linux/mtd/stripe.h 2006-03-28 12:10:48.000000000
+0400
@@ -0,0 +1,39 @@
+/*
+ * MTD device striping layer definitions
+ *
+ * (C) 2005 Intel Corp.
+ *
+ * This code is GPL
+ *
+ *
+ */
+
+#ifndef MTD_STRIPE_H
+#define MTD_STRIPE_H
+
+struct mtd_stripe_info {
+ struct list_head list;
+ char *name; /* new device
name */
+ int interleave_size; /* interleave size */
+ int dev_num; /* number of devices to
be striped */
+ struct mtd_info* devs[MAX_MTD_DEVICES]; /* MTD device to be
striped */
+};
+
+struct mtd_info *mtd_stripe_create(
+ struct mtd_info *subdev[], /* subdevices to stripe */
+ int num_devs, /* number of subdevices */
+ char *name, /* name for the new device */
+ int inteleave_size); /* interleaving size */
+
+
+struct mtd_info *mtd_stripe_create(struct mtd_info *subdev[], /*
subdevices to stripe */
+ int num_devs, /*
number of subdevices */
+ char *name, /* name
for the new device */
+ int interleave_size); /*
interleaving size (sanity check is required) */
+void mtd_stripe_destroy(struct mtd_info *mtd);
+
+int mtd_stripe_init(void);
+int mtd_stripe_exit(void);
+
+#endif
+
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 7:57 [PATCH/RFC] MTD: Striping layer core Belyakov, Alexander
@ 2006-03-30 9:06 ` Vitaly Wool
2006-03-30 11:50 ` Artem B. Bityutskiy
2006-03-30 15:24 ` Alexander Belyakov
2006-03-30 10:35 ` Artem B. Bityutskiy
2006-03-30 12:11 ` Jörn Engel
2 siblings, 2 replies; 65+ messages in thread
From: Vitaly Wool @ 2006-03-30 9:06 UTC (permalink / raw)
To: Belyakov, Alexander; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
Hi Alexander,
Belyakov, Alexander wrote:
> One may say that striping is quite similar to already existing in MTD
> concatenation layer. That is not true since these layers have some sharp
> distinctions. The first one is the purpose. Concatenation only purpose
> is to make larger device from several smaller devices. Striping purpose
> is to make devices operate faster. Next difference is provided access to
> sub-devices. Concatenation layer provides linear access to sub-devices.
> Striping provides interleaved access to sub-devices.
>
Still it's unclear why not to provide a configurable extension to
mtdconcat rather than create a new layer.
> Simultaneous operation means separate threads. Each independent chip
> which participates in creation of striped volume has its own worker
> thread. Worker threads are created at the stage of striped device
> initialization. Each worker thread has its own operation queue and
> interleaving algorithm feeds them. Worker threads interact with flash
> drivers (CFI, NAND subsystem).
>
Sooo many threads... :(
>
> 3. POSSIBLE CONFIGURATIONS AND LIMITATIONS
> It is possible to stripe devices of the same type. We can't stripe NOR
> and NAND, but only NOR and NOR or NAND and NAND. Flashes of the same
> type can differ in erase size and total size.
>
Why is that? Being able to deal only with flash chips of the same type,
your approach has very limited applicability (probably limited to almost
only Intel platforms ;))
And, well, just having looked through the patch, I'd like to point out
multiple #ifdef's in C code and multiple whitespace problems.
> @@ -155,6 +158,15 @@
> };
> };
> up(&map_mutex);
> +
> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
> +#ifndef MODULE
> + if(mtd_stripe_init()) {
> + printk(KERN_WARNING "MTD stripe initialization from cmdline
> has failed\n");
> + }
> +#endif
> +#endif
>
Bah, what's going on here?
> +/* Operation codes */
> +#define MTD_STRIPE_OPCODE_READ 0x1
> +#define MTD_STRIPE_OPCODE_WRITE 0x2
> +#define MTD_STRIPE_OPCODE_READ_ECC 0x3
> +#define MTD_STRIPE_OPCODE_WRITE_ECC 0x4
> +#define MTD_STRIPE_OPCODE_WRITE_OOB 0x5
> +#define MTD_STRIPE_OPCODE_ERASE 0x6
>
You don't need READ_OOB, eh?
> +/*
> + * Miscelaneus support routines
> + */
> +
> +/*
> + * searches for least common multiple of a and b
> + * returns: LCM or 0 in case of error
> + */
> +u_int32_t
> +lcm(u_int32_t a, u_int32_t b)
> +{
> + u_int32_t lcm;
> + u_int32_t t1 = a;
> + u_int32_t t2 = b;
> +
> + if(a <= 0 || b <= 0)
> + {
> + lcm = 0;
> + printk(KERN_ERR "lcm(): wrong arguments\n");
> + }
> + else if(a == b)
> + {
> + /* trivial case */
> + lcm = a;
> + }
> + else
> + {
> + do
> + {
> + lcm = a;
> + a = b;
> + b = lcm - a*(lcm/a);
> + }
> + while(b!=0);
> +
> + if(t1 % a)
> + lcm = (t2 / a) * t1;
> + else
> + lcm = (t1 / a) * t2;
> + }
> +
> + return lcm;
> +} /* int lcm(int a, int b) */
> +
> +u_int32_t last_offset(struct mtd_stripe *stripe, int subdev_num);
> +
> +/*
> + * Calculates last_offset for specific striped subdevice
> + * NOTE: subdev array MUST be sorted
> + * by subdevice size (from the smallest to the largest)
> + */
> +u_int32_t
> +last_offset(struct mtd_stripe *stripe, int subdev_num)
>
Aint this one and stuff alike gonna be static?
> +int stripe_merge_oobinfo(struct mtd_info *mtd, struct mtd_info
> *subdev[], int num_devs)
> +{
> + int ret = 0;
> + int i, j;
> + uint32_t eccpos_max_num = sizeof(mtd->oobinfo.eccpos) /
> sizeof(uint32_t);
> + uint32_t eccpos_counter = 0;
> + uint32_t oobfree_max_num = 8; /* array size defined in mtd-abi.h */
> + uint32_t oobfree_counter = 0;
> +
> + if(mtd->type != MTD_NANDFLASH)
> + return 0;
> +
> + mtd->oobinfo.useecc = subdev[0]->oobinfo.useecc;
> + mtd->oobinfo.eccbytes = subdev[0]->oobinfo.eccbytes;
> + for(i = 1; i < num_devs; i++)
> + {
> + if(mtd->oobinfo.useecc != subdev[i]->oobinfo.useecc ||
> + mtd->oobinfo.eccbytes != subdev[i]->oobinfo.eccbytes)
> + {
> + printk(KERN_ERR "stripe_merge_oobinfo(): oobinfo parameters
> is not compatible for all subdevices\n");
> + return -EINVAL;
> + }
> + }
>
I guess this is a limitation that is not mentioned anywhere.
> +
> + mtd->oobinfo.eccbytes *= num_devs;
> +
> + /* drop old oobavail value */
> + mtd->oobavail = 0;
> +
> + /* merge oobfree space positions */
> + for(i = 0; i < num_devs; i++)
> + {
> + for(j = 0; j < oobfree_max_num; j++)
> + {
> + if(subdev[i]->oobinfo.oobfree[j][1])
> + {
> + if(oobfree_counter >= oobfree_max_num)
> + break;
> +
> + mtd->oobinfo.oobfree[oobfree_counter][0] =
> subdev[i]->oobinfo.oobfree[j][0] +
> + i *
> subdev[i]->oobsize;
> + mtd->oobinfo.oobfree[oobfree_counter][1] =
> subdev[i]->oobinfo.oobfree[j][1];
> +
> + mtd->oobavail += subdev[i]->oobinfo.oobfree[j][1];
> + oobfree_counter++;
> + }
> + }
> + }
> +
> + /* merge ecc positions */
> + for(i = 0; i < num_devs; i++)
> + {
> + for(j = 0; j < eccpos_max_num; j++)
> + {
> + if(subdev[i]->oobinfo.eccpos[j])
> + {
> + if(eccpos_counter >= eccpos_max_num)
> + {
> + printk(KERN_ERR "stripe_merge_oobinfo(): eccpos
> merge error\n");
> + return -EINVAL;
> + }
> +
> mtd->oobinfo.eccpos[eccpos_counter]=subdev[i]->oobinfo.eccpos[j] + i *
> subdev[i]->oobsize;
> + eccpos_counter++;
> + }
> + }
> + }
> +
> + return ret;
> +}
> +
> +/* End of support routines */
> +
> +/* Multithreading support routines */
> +
> +/* Write to flash thread */
> +static void
> +stripe_write_thread(void *arg)
> +{
> + struct mtd_sw_thread_info* info = (struct mtd_sw_thread_info*)arg;
> + struct mtd_stripe_op* op;
> + struct subop_struct* subops;
> + u_int32_t retsize;
> + int err;
> +
> + int i;
> + struct list_head *pos;
> +
> + /* erase operation stuff */
> + struct erase_info erase; /* local copy */
> + struct erase_info *instr; /* pointer to original */
> +
> + info->thread = current;
> + up(&info->sw_thread_startstop);
> +
> + while(info->sw_thread)
> + {
> + /* wait for downcoming write/erase operation */
> + down(&info->sw_thread_wait);
> +
> + /* issue operation to the device and remove it from the list
> afterwards*/
> + spin_lock(&info->list_lock);
> + if(!list_empty(&info->list))
> + {
> + op = list_entry(info->list.next,struct mtd_stripe_op, list);
> + }
> + else
> + {
> + /* no operation in queue but sw_thread_wait has been rised.
> + * it means stripe_stop_write_thread() has been called
> + */
> + op = NULL;
> + }
> + spin_unlock(&info->list_lock);
> +
> + /* leave main thread loop if no ops */
> + if(!op)
> + break;
> +
> + err = 0;
> + op->status = 0;
> +
> + switch(op->opcode)
> + {
> + case MTD_STRIPE_OPCODE_WRITE:
> + case MTD_STRIPE_OPCODE_WRITE_OOB:
> + /* proceed with list head first */
> + subops = &op->subops;
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + if(op->opcode == MTD_STRIPE_OPCODE_WRITE)
> + err = info->subdev->write(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
> subops->ops_array[i].buf);
> + else
> + err = info->subdev->write_oob(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
> subops->ops_array[i].buf);
> +
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: write operation
> failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(!op->status)
> + {
> + /* now proceed each list element except head */
> + list_for_each(pos, &op->subops.list)
> + {
> + subops = list_entry(pos, struct subop_struct,
> list);
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + if(op->opcode == MTD_STRIPE_OPCODE_WRITE)
> + err = info->subdev->write(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len, &retsize,
> subops->ops_array[i].buf);
> + else
> + err =
> info->subdev->write_oob(info->subdev, subops->ops_array[i].ofs,
> subops->ops_array[i].len, &retsize, subops->ops_array[i].buf);
> +
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: write
> operation failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(op->status)
> + break;
> + }
> + }
> + break;
> +
> + case MTD_STRIPE_OPCODE_ERASE:
> + subops = &op->subops;
> + instr = (struct erase_info *)subops->ops_array[0].buf;
> +
> + /* make a local copy of original erase instruction to
> avoid modifying the caller's struct */
> + erase = *instr;
> + erase.addr = subops->ops_array[0].ofs;
> + erase.len = subops->ops_array[0].len;
> +
> + if ((err = stripe_dev_erase(info->subdev, &erase)))
> + {
> + /* sanity check: should never happen since
> + * block alignment has been checked early in
> stripe_erase() */
> +
> + if(erase.fail_addr != 0xffffffff)
> + /* For now this adddres shows address
> + * at failed subdevice,but not at "super" device
> */
> + op->fail_addr = erase.fail_addr;
> + }
> +
> + op->status = err;
> + op->state = erase.state;
> + break;
> +
> + case MTD_STRIPE_OPCODE_WRITE_ECC:
> + /* proceed with list head first */
> + subops = &op->subops;
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + err = info->subdev->write_ecc(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf,
> +
> subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: write operation
> failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(!op->status)
> + {
> + /* now proceed each list element except head */
> + list_for_each(pos, &op->subops.list)
> + {
> + subops = list_entry(pos, struct subop_struct,
> list);
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + err = info->subdev->write_ecc(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf,
> +
> subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: write
> operation failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(op->status)
> + break;
> + }
> + }
> + break;
> +
> + case MTD_STRIPE_OPCODE_READ_ECC:
> + case MTD_STRIPE_OPCODE_READ:
> + /* proceed with list head first */
> + subops = &op->subops;
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + if(op->opcode == MTD_STRIPE_OPCODE_READ_ECC)
> + {
> + err = info->subdev->read_ecc(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf,
> +
> subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
> + }
> + else
> + {
> + err = info->subdev->read(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf);
> + }
> +
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: read operation
> failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(!op->status)
> + {
> + /* now proceed each list element except head */
> + list_for_each(pos, &op->subops.list)
> + {
> + subops = list_entry(pos, struct subop_struct,
> list);
> +
> + for(i = 0; i < subops->ops_num; i++)
> + {
> + if(op->opcode == MTD_STRIPE_OPCODE_READ_ECC)
> + {
> + err =
> info->subdev->read_ecc(info->subdev, subops->ops_array[i].ofs,
> subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf,
> +
> subops->ops_array[i].eccbuf, &info->subdev->oobinfo);
> + }
> + else
> + {
> + err = info->subdev->read(info->subdev,
> subops->ops_array[i].ofs, subops->ops_array[i].len,
> + &retsize,
> subops->ops_array[i].buf);
> + }
> +
> + if(err)
> + {
> + op->status = -EINVAL;
> + printk(KERN_ERR "mtd_stripe: read
> operation failed %d\n",err);
> + break;
> + }
> + }
> +
> + if(op->status)
> + break;
> + }
> + }
> +
> + break;
> +
> + default:
> + /* unknown operation code */
> + printk(KERN_ERR "mtd_stripe: invalid operation code %d",
> op->opcode);
> + op->status = -EINVAL;
> + break;
> + };
> +
> + /* remove issued operation from the list */
> + spin_lock(&info->list_lock);
> + list_del(&op->list);
> + spin_unlock(&info->list_lock);
> +
> + /* raise semaphore to let stripe_write() or stripe_erase()
> continue */
> + up(&op->sem);
> + }
> +
> + info->thread = NULL;
> + up(&info->sw_thread_startstop);
> +}
> +
> +/* Launches write to flash thread */
> +int
> +stripe_start_write_thread(struct mtd_sw_thread_info* info, struct
> mtd_info *device)
> +{
> + pid_t pid;
> + int ret = 0;
> +
> + if(info->thread)
> + BUG();
> +
> + info->subdev = device; /* set the
> pointer to corresponding device */
> +
> + init_MUTEX_LOCKED(&info->sw_thread_startstop); /* init
> start/stop semaphore */
> + info->sw_thread = 1; /* set continue
> thread flag */
> + init_MUTEX_LOCKED(&info->sw_thread_wait); /* init "wait for data"
> semaphore */
> +
> + INIT_LIST_HEAD(&info->list); /* initialize
> operation list head */
> +
> + spin_lock_init(&info->list_lock); /* init list lock */
> +
> + pid = kernel_thread((int (*)(void *))stripe_write_thread, info,
> CLONE_KERNEL); /* flags (3rd arg) TBD */
> + if (pid < 0)
> + {
> + printk(KERN_ERR "fork failed for MTD stripe thread: %d\n",
> -pid);
> + ret = pid;
> + }
> + else
> + {
> + /* wait thread started */
> + DEBUG(MTD_DEBUG_LEVEL1, "MTD stripe: write thread has pid %d\n",
> pid);
> + down(&info->sw_thread_startstop);
> + }
> +
> + return ret;
> +}
> +
> +/* Complete write to flash thread */
> +void
> +stripe_stop_write_thread(struct mtd_sw_thread_info* info)
> +{
> + if(info->thread)
> + {
> + info->sw_thread = 0; /* drop thread flag */
> + up(&info->sw_thread_wait); /* let the thread
> complete */
> + down(&info->sw_thread_startstop); /* wait for thread
> completion */
> + DEBUG(MTD_DEBUG_LEVEL1, "MTD stripe: writing thread has been
> stopped\n");
> + }
> +}
> +
> +/* Updates write/erase thread priority to max value
> + * based on operations in the queue
> + */
> +void
> +stripe_set_write_thread_prio(struct mtd_sw_thread_info* info)
> +{
> + struct mtd_stripe_op *op;
> + int oldnice, newnice;
> + struct list_head *pos;
> +
> + newnice = oldnice = info->thread->static_prio - MAX_RT_PRIO - 20;
> +
> + spin_lock(&info->list_lock);
> + list_for_each(pos, &info->list)
> + {
> + op = list_entry(pos, struct mtd_stripe_op, list);
> + newnice = (op->op_prio < newnice) ? op->op_prio : newnice;
> + }
> + spin_unlock(&info->list_lock);
> +
> + newnice = (newnice < -20) ? -20 : newnice;
> +
> + if(oldnice != newnice)
> + set_user_nice(info->thread, newnice);
> +}
> +
> +/* add sub operation into the array
> + op - pointer to the operation structure
> + ofs - operation offset within subdevice
> + len - data to be written/erased
> + buf - pointer to the buffer with data to be written (NULL is erase
> operation)
> +
> + returns: 0 - success
> +*/
> +static inline int
> +stripe_add_subop(struct mtd_stripe_op *op, u_int32_t ofs, u_int32_t
> len, const u_char *buf, const u_char *eccbuf)
> +{
> + u_int32_t size; /* number of items in
> the new array (if any) */
> + struct subop_struct *subop;
> +
> + if(!op)
> + BUG(); /* error */
> +
> + /* get tail list element or head */
> + subop = list_entry(op->subops.list.prev, struct subop_struct,
> list);
> +
> + /* check if current suboperation array is already filled or not */
> + if(subop->ops_num >= subop->ops_num_max)
> + {
> + /* array is full. allocate new one and add to list */
> + size = SIZEOF_STRUCT_MTD_STRIPE_SUBOP(op->subops.ops_num_max);
> + subop = kmalloc(size, GFP_KERNEL);
> + if(!subop)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(subop, 0, size);
> + subop->ops_num = 0;
> + subop->ops_num_max = op->subops.ops_num_max;
> + subop->ops_array = (struct subop *)(subop + 1);
> +
> + list_add_tail(&subop->list, &op->subops.list);
> + }
> +
> + subop->ops_array[subop->ops_num].ofs = ofs;
> + subop->ops_array[subop->ops_num].len = len;
> + subop->ops_array[subop->ops_num].buf = (u_char *)buf;
> + subop->ops_array[subop->ops_num].eccbuf = (u_char *)eccbuf;
> +
> + subop->ops_num++; /* increase stored suboperations counter */
> +
> + return 0;
> +}
> +
> +/* deallocates memory allocated by stripe_add_subop routine */
> +static void
> +stripe_destroy_op(struct mtd_stripe_op *op)
> +{
> + struct subop_struct *subop;
> +
> + while(!list_empty(&op->subops.list))
> + {
> + subop = list_entry(op->subops.list.next,struct subop_struct,
> list);
> + list_del(&subop->list);
> + kfree(subop);
> + }
> +}
> +
> +/* adds new operation to the thread queue and unlock wait semaphore for
> specific thread */
> +static void
> +stripe_add_op(struct mtd_sw_thread_info* info, struct mtd_stripe_op*
> op)
> +{
> + if(!info || !op)
> + BUG();
> +
> + spin_lock(&info->list_lock);
> + list_add_tail(&op->list, &info->list);
> + spin_unlock(&info->list_lock);
> +}
> +
> +/* End of multithreading support routines */
> +
> +
> +/*
> + * MTD methods which look up the relevant subdevice, translate the
> + * effective address and pass through to the subdevice.
> + */
> +
> +
> +/* sychroneous read from striped volume */
> +static int
> +stripe_read_sync(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf)
> +{
> + u_int32_t from_loc = (u_int32_t)from; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> + size_t retsize; /* data read/written from/to
> subdev (bytes) */
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_sync(): offset = 0x%08x, size
> = %d\n", from_loc, len);
> +
> + /* Check whole striped device bounds here */
> + if(from_loc + len > mtd->size)
> + {
> + return err;
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_sync(): device = %d, offset =
> 0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
> + err =
> stripe->subdev[subdev_number]->read(stripe->subdev[subdev_number],
> subdev_offset_low, subdev_len, &retsize, buf);
> + if(!err)
> + {
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_sync(): device = %d, offset
> = 0x%08x, len = %d\n", subdev_number, subdev_offset *
> stripe->interleave_size, subdev_len);
> + err =
> stripe->subdev[subdev_number]->read(stripe->subdev[subdev_number],
> subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf);
> + if(err)
> + break;
> +
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_sync(): read %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +/* asychroneous read from striped volume */
> +static int
> +stripe_read_async(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf)
> +{
> + u_int32_t from_loc = (u_int32_t)from; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_async(): offset = 0x%08x, size
> = %d\n", from_loc, len);
> +
> + /* Check whole striped device bounds here */
> + if(from_loc + len > mtd->size)
> + {
> + return err;
> + }
> +
> + /* allocate memory for multithread operations */
> + queue_size = len / stripe->interleave_size / stripe->num_subdev +
> 1; /* default queue size. could be set to predefined value */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_READ;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* asynch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_async(): device = %d, offset =
> 0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
> subdev_len, buf, NULL);
> + if(!err)
> + {
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_async(): device = %d,
> offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
> stripe->interleave_size, subdev_len);
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset *
> stripe->interleave_size, subdev_len, buf, NULL);
> + if(err)
> + break;
> +
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + /* Push operation into the corresponding threads queue and rise
> semaphores */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + err = ops[i].status;
> + }
> +
> + /* Deallocate all memory before exit */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_async(): read %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +static int
> +stripe_read(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf)
> +{
> + int err;
> + if(mtd->type == MTD_NANDFLASH)
> + err = stripe_read_async(mtd, from, len, retlen, buf);
> + else
> + err = stripe_read_sync(mtd, from, len, retlen, buf);
> +
> + return err;
> +}
> +
> +
> +static int
> +stripe_write(struct mtd_info *mtd, loff_t to, size_t len,
> + size_t * retlen, const u_char * buf)
> +{
> + u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
> MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned block */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write(): offset = 0x%08x, size =
> %d\n", to_loc, len);
> +
> + /* check if no data is going to be written */
> + if(!len)
> + return 0;
> +
> + /* Check whole striped device bounds here */
> + if(to_loc + len > mtd->size)
> + return err;
> +
> + /* allocate memory for multithread operations */
> + queue_size = len / stripe->interleave_size / stripe->num_subdev +
> 1; /* default queue size. could be set to predefined value */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_WRITE;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(to_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
> - 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
> + subdev_number = (to_loc / stripe->interleave_size) % dev_count;
> + }
> +
> + subdev_offset_low = to_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
> subdev_len, buf, NULL);
> + if(!err)
> + {
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(to_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset *
> stripe->interleave_size, subdev_len, buf, NULL);
> + if(err)
> + break;
> +
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + if(to_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + /* Push operation into the corresponding threads queue and rise
> semaphores */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + err = ops[i].status;
> + }
> +
> + /* Deallocate all memory before exit */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write(): written %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +/* synchroneous ecc read from striped volume */
> +static int
> +stripe_read_ecc_sync(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf, u_char * eccbuf,
> + struct nand_oobinfo *oobsel)
> +{
> + u_int32_t from_loc = (u_int32_t)from; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> + size_t retsize; /* data read/written from/to
> subdev (bytes) */
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_sync(): offset = 0x%08x,
> size = %d\n", from_loc, len);
> +
> + if(oobsel != NULL)
> + {
> + /* check if oobinfo is has been chandes by FS */
> + if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
> + {
> + printk(KERN_ERR "stripe_read_ecc_sync(): oobinfo has been
> changed by FS (not supported yet)\n");
> + return err;
> + }
> + }
> +
> + /* Check whole striped device bounds here */
> + if(from_loc + len > mtd->size)
> + {
> + return err;
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_sync(): device = %d,
> offset = 0x%08x, len = %d\n", subdev_number, subdev_offset_low,
> subdev_len);
> + err =
> stripe->subdev[subdev_number]->read_ecc(stripe->subdev[subdev_number],
> subdev_offset_low, subdev_len, &retsize, buf, eccbuf,
> &stripe->subdev[subdev_number]->oobinfo);
> + if(!err)
> + {
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_sync(): device = %d,
> offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
> stripe->interleave_size, subdev_len);
> + err =
> stripe->subdev[subdev_number]->read_ecc(stripe->subdev[subdev_number],
> subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf,
> eccbuf, &stripe->subdev[subdev_number]->oobinfo);
> + if(err)
> + break;
> +
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(from + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_sync(): read %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +/* asynchroneous ecc read from striped volume */
> +static int
> +stripe_read_ecc_async(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf, u_char * eccbuf,
> + struct nand_oobinfo *oobsel)
> +{
> + u_int32_t from_loc = (u_int32_t)from; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_async(): offset = 0x%08x,
> size = %d\n", from_loc, len);
> +
> + if(oobsel != NULL)
> + {
> + /* check if oobinfo is has been chandes by FS */
> + if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
> + {
> + printk(KERN_ERR "stripe_read_ecc_async(): oobinfo has been
> changed by FS (not supported yet)\n");
> + return err;
> + }
> + }
> +
> + /* Check whole striped device bounds here */
> + if(from_loc + len > mtd->size)
> + {
> + return err;
> + }
> +
> + /* allocate memory for multithread operations */
> + queue_size = len / stripe->interleave_size / stripe->num_subdev +
> 1; /* default queue size. could be set to predefined value */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_READ_ECC;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Issue read operation here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_async(): device = %d,
> offset = 0x%08x, len = %d\n", subdev_number, subdev_offset_low,
> subdev_len);
> +
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
> subdev_len, buf, eccbuf);
> + if(!err)
> + {
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(eccbuf)
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(from_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Issue read operation here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_ecc_async(): device = %d,
> offset = 0x%08x, len = %d\n", subdev_number, subdev_offset *
> stripe->interleave_size, subdev_len);
> +
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset *
> stripe->interleave_size, subdev_len, buf, eccbuf);
> + if(err)
> + break;
> +
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(eccbuf)
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(from + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + /* Push operation into the corresponding threads queue and rise
> semaphores */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + err = ops[i].status;
> + }
> +
> + /* Deallocate all memory before exit */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_ecc_async(): read %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +static int
> +stripe_read_ecc(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf, u_char * eccbuf,
> + struct nand_oobinfo *oobsel)
> +{
> + int err;
> + if(mtd->type == MTD_NANDFLASH)
> + err = stripe_read_ecc_async(mtd, from, len, retlen, buf, eccbuf,
> oobsel);
> + else
> + err = stripe_read_ecc_sync(mtd, from, len, retlen, buf, eccbuf,
> oobsel);
> +
> + return err;
> +}
> +
> +
> +static int
> +stripe_write_ecc(struct mtd_info *mtd, loff_t to, size_t len,
> + size_t * retlen, const u_char * buf, u_char * eccbuf,
> + struct nand_oobinfo *oobsel)
> +{
> + u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
> MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned block */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_ecc(): offset = 0x%08x, size
> = %d\n", to_loc, len);
> +
> + if(oobsel != NULL)
> + {
> + /* check if oobinfo is has been chandes by FS */
> + if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
> + {
> + printk(KERN_ERR "stripe_write_ecc(): oobinfo has been
> changed by FS (not supported yet)\n");
> + return err;
> + }
> + }
> +
> + /* check if no data is going to be written */
> + if(!len)
> + return 0;
> +
> + /* Check whole striped device bounds here */
> + if(to_loc + len > mtd->size)
> + return err;
> +
> + /* allocate memory for multithread operations */
> + queue_size = len / stripe->interleave_size / stripe->num_subdev +
> 1; /* default queue size */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_WRITE_ECC;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(to_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
> - 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
> + subdev_number = (to_loc / stripe->interleave_size) % dev_count;
> + }
> +
> + subdev_offset_low = to_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
> subdev_len, buf, eccbuf);
> + if(!err)
> + {
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(eccbuf)
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(to_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset *
> stripe->interleave_size, subdev_len, buf, eccbuf);
> + if(err)
> + break;
> +
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> + if(eccbuf)
> + eccbuf += stripe->subdev[subdev_number]->oobavail;
> +
> + if(to_loc + *retlen >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + /* Push operation into the corresponding threads queue and rise
> semaphores */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + err = ops[i].status;
> + }
> +
> + /* Deallocate all memory before exit */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_ecc(): written %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +
> +static int
> +stripe_read_oob(struct mtd_info *mtd, loff_t from, size_t len,
> + size_t * retlen, u_char * buf)
> +{
> + u_int32_t from_loc = (u_int32_t)from; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> + size_t retsize; /* data read/written from/to
> subdev (bytes) */
> +
> + u_int32_t subdev_oobavail = stripe->subdev[0]->oobsize;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_oob(): offset = 0x%08x, size =
> %d\n", from_loc, len);
> +
> + /* Check whole striped device bounds here */
> + if(from_loc + len > mtd->size)
> + {
> + return err;
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % subdev_oobavail;
> + subdev_len = (len_left < (subdev_oobavail - subdev_offset_low)) ?
> len_left : (subdev_oobavail - subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_oob(): device = %d, offset =
> 0x%08x, len = %d\n", subdev_number, subdev_offset_low, subdev_len);
> + err =
> stripe->subdev[subdev_number]->read_oob(stripe->subdev[subdev_number],
> subdev_offset_low, subdev_len, &retsize, buf);
> + if(!err)
> + {
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + /* increase flash offset by interleave size since oob blocks
> + * aligned with page size (i.e. interleave size) */
> + from_loc += stripe->interleave_size;
> +
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < subdev_oobavail) ? len_left :
> subdev_oobavail;
> +
> + /* Synch read here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_read_oob(): device = %d, offset
> = 0x%08x, len = %d\n", subdev_number, subdev_offset *
> stripe->interleave_size, subdev_len);
> + err =
> stripe->subdev[subdev_number]->read_oob(stripe->subdev[subdev_number],
> subdev_offset * stripe->interleave_size, subdev_len, &retsize, buf);
> + if(err)
> + break;
> +
> + *retlen += retsize;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + /* increase flash offset by interleave size since oob blocks
> + * aligned with page size (i.e. interleave size) */
> + from_loc += stripe->interleave_size;
> +
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_read_oob(): read %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +static int
> +stripe_write_oob(struct mtd_info *mtd, loff_t to, size_t len,
> + size_t *retlen, const u_char * buf)
> +{
> + u_int32_t to_loc = (u_int32_t)to; /* we can do this since whole
> MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned block */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to read/write
> left (bytes) */
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + //u_int32_t subdev_oobavail = stripe->subdev[0]->oobavail;
> + u_int32_t subdev_oobavail = stripe->subdev[0]->oobsize;
> +
> + *retlen = 0;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_oob(): offset = 0x%08x, size
> = %d\n", to_loc, len);
> +
> + /* check if no data is going to be written */
> + if(!len)
> + return 0;
> +
> + /* Check whole striped device bounds here */
> + if(to_loc + len > mtd->size)
> + return err;
> +
> + /* allocate memory for multithread operations */
> + queue_size = len / subdev_oobavail / stripe->num_subdev + 1;
> /* default queue size. could be set to predefined value */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "stripe_write_oob(): memory allocation
> error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_WRITE_OOB;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(to_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((to_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((to_loc - stripe->subdev_last_offset[i
> - 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (to_loc / stripe->interleave_size) / dev_count;
> + subdev_number = (to_loc / stripe->interleave_size) % dev_count;
> + }
> +
> + subdev_offset_low = to_loc % subdev_oobavail;
> + subdev_len = (len_left < (subdev_oobavail - subdev_offset_low)) ?
> len_left : (subdev_oobavail - subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset_low,
> subdev_len, buf, NULL);
> +
>
> + if(!err)
> + {
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + /* increase flash offset by interleave size since oob blocks
> + * aligned with page size (i.e. interleave size) */
> + to_loc += stripe->interleave_size;
> +
> + if(to_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < subdev_oobavail) ? len_left :
> subdev_oobavail;
> +
> + /* Add suboperation to queue here */
> + err = stripe_add_subop(&ops[subdev_number], subdev_offset *
> stripe->interleave_size, subdev_len, buf, NULL);
> + if(err)
> + break;
> +
> + *retlen += subdev_len;
> + len_left -= subdev_len;
> + buf += subdev_len;
> +
> + /* increase flash offset by interleave size since oob blocks
> + * aligned with page size (i.e. interleave size) */
> + to_loc += stripe->interleave_size;
> +
> + if(to_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + /* Push operation into the corresponding threads queue and rise
> semaphores */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + err = ops[i].status;
> + }
> +
> + /* Deallocate all memory before exit */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_write_oob(): written %d bytes\n",
> *retlen);
> + return err;
> +}
> +
> +/* this routine aimed to support striping on NOR_ECC
> + * it has been taken from cfi_cmdset_0001.c
> + */
> +static int
> +stripe_writev (struct mtd_info *mtd, const struct kvec *vecs, unsigned
> long count,
> + loff_t to, size_t * retlen)
> +{
> + int i, page, len, total_len, ret = 0, written = 0, cnt = 0,
> towrite;
> + u_char *bufstart;
> + char* data_poi;
> + char* data_buf;
> + loff_t write_offset;
> + int rl_wr;
> +
> + u_int32_t pagesize;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "==> stripe_writev()\n");
> +
> +#ifdef MTD_PROGRAM_REGIONS
> + /* Montavista patch for Sibley support detected */
> + if(mtd->flags & MTD_PROGRAM_REGIONS)
> + {
> + pagesize = MTD_PROGREGION_SIZE(mtd);
> + }
> + else if(mtd->flags & MTD_ECC)
> + {
> + pagesize = mtd->eccsize;
> + }
> + else
> + {
> + printk(KERN_ERR "stripe_writev() has been called for device
> without MTD_PROGRAM_REGIONS or MTD_ECC set\n");
> + return -EINVAL;
> + }
> +#else
> + if(mtd->flags & MTD_ECC)
> + {
> + pagesize = mtd->eccsize;
> + }
> + else
> + {
> + printk(KERN_ERR "stripe_writev() has been called for device
> without MTD_ECC set\n");
> + return -EINVAL;
> + }
> +#endif
> +
> + data_buf = kmalloc(pagesize, GFP_KERNEL);
> +
> + /* Preset written len for early exit */
> + *retlen = 0;
> +
> + /* Calculate total length of data */
> + total_len = 0;
> + for (i = 0; i < count; i++)
> + total_len += (int) vecs[i].iov_len;
> +
> + /* check if no data is going to be written */
> + if(!total_len)
> + {
> + kfree(data_buf);
> + return 0;
> + }
> +
> + /* Do not allow write past end of page */
> + if ((to + total_len) > mtd->size) {
> + DEBUG (MTD_DEBUG_LEVEL0, "stripe_writev(): Attempted write past
> end of device\n");
> + kfree(data_buf);
> + return -EINVAL;
> + }
> +
> + /* Setup start page */
> + page = ((int) to) / pagesize;
> + towrite = (page + 1) * pagesize - to; /* rest of the page */
> + write_offset = to;
> + written = 0;
> + /* Loop until all iovecs' data has been written */
> + len = 0;
> + while (len < total_len) {
> + bufstart = (u_char *)vecs->iov_base;
> + bufstart += written;
> + data_poi = bufstart;
> +
> + /* If the given tuple is >= reet of page then
> + * write it out from the iov
> + */
> + if ( (vecs->iov_len-written) >= towrite) { /* The fastest
> case is to write data by int * blocksize */
> + ret = mtd->write(mtd, write_offset, towrite, &rl_wr,
> data_poi);
> + if(ret)
> + break;
> + len += towrite;
> + page ++;
> + write_offset = page * pagesize;
> + towrite = pagesize;
> + written += towrite;
> + if(vecs->iov_len == written) {
> + vecs ++;
> + written = 0;
> + }
> + }
> + else
> + {
> + cnt = 0;
> + while(cnt < towrite ) {
> + data_buf[cnt++] = ((u_char *)
> vecs->iov_base)[written++];
> + if(vecs->iov_len == written )
> + {
> + if((cnt+len) == total_len )
> + break;
> + vecs ++;
> + written = 0;
> + }
> + }
> + data_poi = data_buf;
> + ret = mtd->write(mtd, write_offset, cnt, &rl_wr, data_poi);
> + if (ret)
> + break;
> + len += cnt;
> + page ++;
> + write_offset = page * pagesize;
> + towrite = pagesize;
> + }
> + }
> +
> + if(retlen)
> + *retlen = len;
> + kfree(data_buf);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_writev()\n");
> +
> + return ret;
> +}
> +
> +
> +static int
> +stripe_writev_ecc (struct mtd_info *mtd, const struct kvec *vecs,
> unsigned long count,
> + loff_t to, size_t * retlen, u_char *eccbuf, struct
> nand_oobinfo *oobsel)
> +{
> + int i, page, len, total_len, ret = 0, written = 0, cnt = 0,
> towrite;
> + u_char *bufstart;
> + char* data_poi;
> + char* data_buf;
> + loff_t write_offset;
> + data_buf = kmalloc(mtd->oobblock, GFP_KERNEL);
> + int rl_wr;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "==> stripe_writev_ecc()\n");
> +
> + if(oobsel != NULL)
> + {
> + /* check if oobinfo is has been chandes by FS */
> + if(memcmp(oobsel, &mtd->oobinfo, sizeof(struct nand_oobinfo)))
> + {
> + printk(KERN_ERR "stripe_writev_ecc(): oobinfo has been
> changed by FS (not supported yet)\n");
> + kfree(data_buf);
> + return -EINVAL;
> + }
> + }
> +
> + if(!(mtd->flags & MTD_ECC))
> + {
> + printk(KERN_ERR "stripe_writev_ecc() has been called for device
> without MTD_ECC set\n");
> + kfree(data_buf);
> + return -EINVAL;
> + }
> +
> + /* Preset written len for early exit */
> + *retlen = 0;
> +
> + /* Calculate total length of data */
> + total_len = 0;
> + for (i = 0; i < count; i++)
> + total_len += (int) vecs[i].iov_len;
> +
> + /* check if no data is going to be written */
> + if(!total_len)
> + {
> + kfree(data_buf);
> + return 0;
> + }
> +
> + /* Do not allow write past end of page */
> + if ((to + total_len) > mtd->size) {
> + DEBUG (MTD_DEBUG_LEVEL0, "stripe_writev_ecc(): Attempted write
> past end of device\n");
> + kfree(data_buf);
> + return -EINVAL;
> + }
> +
> + /* Check "to" and "len" alignment here */
> + /* NOTE: can't use if(to & (mtd->ooblock - 1)) alignment check here
> since
> + * mtd->oobblock can be not-power-of-two number */
> + if((((int) to) % mtd->oobblock) || (total_len % mtd->oobblock))
> + {
> + printk(KERN_ERR "stripe_writev_ecc(): Attempted write not
> aligned data!\n");
> + kfree(data_buf);
> + return -EINVAL;
> + }
> +
> + /* Setup start page. Notaligned data is not allowed for write_ecc.
> */
> + page = ((int) to) / mtd->oobblock;
> + towrite = (page + 1) * mtd->oobblock - to; /* aligned with
> oobblock */
> + write_offset = to;
> + written = 0;
> + /* Loop until all iovecs' data has been written */
> + len = 0;
> + while (len < total_len) {
> + bufstart = (u_char *)vecs->iov_base;
> + bufstart += written;
> + data_poi = bufstart;
> +
> + /* If the given tuple is >= reet of page then
> + * write it out from the iov
> + */
> + if ( (vecs->iov_len-written) >= towrite) { /* The fastest
> case is to write data by int * blocksize */
> + ret = mtd->write_ecc(mtd, write_offset, towrite, &rl_wr,
> data_poi, eccbuf, oobsel);
> + if(ret)
> + break;
> + len += rl_wr;
> + page ++;
> + write_offset = page * mtd->oobblock;
> + towrite = mtd->oobblock;
> + written += towrite;
> + if(vecs->iov_len == written) {
> + vecs ++;
> + written = 0;
> + }
> +
> + if(eccbuf)
> + eccbuf += mtd->oobavail;
> + }
> + else
> + {
> + cnt = 0;
> + while(cnt < towrite ) {
> + data_buf[cnt++] = ((u_char *)
> vecs->iov_base)[written++];
> + if(vecs->iov_len == written )
> + {
> + if((cnt+len) == total_len )
> + break;
> + vecs ++;
> + written = 0;
> + }
> + }
> + data_poi = data_buf;
> + ret = mtd->write_ecc(mtd, write_offset, cnt, &rl_wr,
> data_poi, eccbuf, oobsel);
> + if (ret)
> + break;
> + len += rl_wr;
> + page ++;
> + write_offset = page * mtd->oobblock;
> + towrite = mtd->oobblock;
> +
> + if(eccbuf)
> + eccbuf += mtd->oobavail;
> + }
> + }
> +
> + if(retlen)
> + *retlen = len;
> + kfree(data_buf);
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_writev_ecc()\n");
> +
> + return ret;
> +}
> +
> +
> +static void
> +stripe_erase_callback(struct erase_info *instr)
> +{
> + wake_up((wait_queue_head_t *) instr->priv);
> +}
> +
> +static int
> +stripe_dev_erase(struct mtd_info *mtd, struct erase_info *erase)
> +{
> + int err;
> + wait_queue_head_t waitq;
> + DECLARE_WAITQUEUE(wait, current);
> +
> + init_waitqueue_head(&waitq);
> +
> + erase->mtd = mtd;
> + erase->callback = stripe_erase_callback;
> + erase->priv = (unsigned long) &waitq;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_dev_erase(): addr=0x%08x,
> len=%d\n", erase->addr, erase->len);
> +
> + /*
> + * FIXME: Allow INTERRUPTIBLE. Which means
> + * not having the wait_queue head on the stack.
> + */
> + err = mtd->erase(mtd, erase);
> + if (!err)
> + {
> + set_current_state(TASK_UNINTERRUPTIBLE);
> + add_wait_queue(&waitq, &wait);
> + if (erase->state != MTD_ERASE_DONE
> + && erase->state != MTD_ERASE_FAILED)
> + schedule();
> + remove_wait_queue(&waitq, &wait);
> + set_current_state(TASK_RUNNING);
> +
> + err = (erase->state == MTD_ERASE_FAILED) ? -EIO : 0;
> + }
> + return err;
> +}
> +
> +static int
> +stripe_erase(struct mtd_info *mtd, struct erase_info *instr)
> +{
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int i, err;
> + struct mtd_stripe_erase_bounds *erase_bounds;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to erase
> (bytes) */
> + size_t subdev_len; /* data size to be erased at
> this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left; /* total data size left to be
> erased (bytes) */
> + size_t len_done; /* total data size erased */
> + u_int32_t from;
> +
> + struct mtd_stripe_op *ops; /* operations array (one per
> thread) */
> + u_int32_t size; /* amount of memory to be
> allocated for thread operations */
> + u_int32_t queue_size;
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_earse(): addr=0x%08x, len=%d\n",
> instr->addr, instr->len);
> +
> + if(!(mtd->flags & MTD_WRITEABLE))
> + return -EROFS;
> +
> + if(instr->addr > stripe->mtd.size)
> + return -EINVAL;
> +
> + if(instr->len + instr->addr > stripe->mtd.size)
> + return -EINVAL;
> +
> + /*
> + * Check for proper erase block alignment of the to-be-erased area.
> + */
> + if(!stripe->mtd.numeraseregions)
> + {
> + /* striped device has uniform erase block size */
> + /* NOTE: can't use if(instr->addr & (stripe->mtd.erasesize - 1))
> alignment check here
> + * since stripe->mtd.erasesize can be not-power-of-two number */
> + if(instr->addr % stripe->mtd.erasesize || instr->len %
> stripe->mtd.erasesize)
> + return -EINVAL;
> + }
> + else
> + {
> + /* we should not get here */
> + return -EINVAL;
> + }
> +
> + instr->fail_addr = 0xffffffff;
> +
> + /* allocate memory for multithread operations */
> + queue_size = 1; /* queue size for erase opration is 1 */
> + size = stripe->num_subdev *
> SIZEOF_STRUCT_MTD_STRIPE_OP(queue_size);
> + ops = kmalloc(size, GFP_KERNEL);
> + if(!ops)
> + {
> + printk(KERN_ERR "mtd_stripe: memory allocation error!\n");
> + return -ENOMEM;
> + }
> +
> + memset(ops, 0, size);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + ops[i].opcode = MTD_STRIPE_OPCODE_ERASE;
> + ops[i].caller_id = 0; /* TBD */
> + init_MUTEX_LOCKED(&ops[i].sem); /* mutex is locked here.
> to be unlocked by device thread */
> + //ops[i].status = 0; /* TBD */
> + ops[i].fail_addr = 0xffffffff;
> +
> + INIT_LIST_HEAD(&ops[i].subops.list); /* initialize
> suboperation list head */
> +
> + ops[i].subops.ops_num = 0; /* to be increased later
> here */
> + ops[i].subops.ops_num_max = queue_size; /* total number of
> suboperations can be stored in the array */
> + ops[i].subops.ops_array = (struct subop *)((char *)(ops +
> stripe->num_subdev) + i * queue_size * sizeof(struct subop));
> + }
> +
> + len_left = instr->len;
> + len_done = 0;
> + from = instr->addr;
> +
> + /* allocate memory for erase boundaries for all subdevices */
> + erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
> mtd_stripe_erase_bounds), GFP_KERNEL);
> + if(!erase_bounds)
> + {
> + kfree(ops);
> + return -ENOMEM;
> + }
> + memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
> stripe->num_subdev);
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from - stripe->subdev_last_offset[i - 1])
> / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from / stripe->interleave_size) / dev_count;
> + subdev_number = (from / stripe->interleave_size) % dev_count;
> + }
> +
> + /* Should by optimized for erase op */
> + subdev_offset_low = from % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add/extend block-to-be erased */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset_low;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> + len_left -= subdev_len;
> + len_done += subdev_len;
> +
> + if(from + len_done >= stripe->subdev_last_offset[stripe->num_subdev
> - dev_count])
> + dev_count--;
> +
> + while(len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size; /* can by optimized for erase op*/
> +
> + /* Add/extend block-to-be erased */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset *
> stripe->interleave_size;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> + len_left -= subdev_len;
> + len_done += subdev_len;
> +
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_erase(): device = %d, addr =
> 0x%08x, len = %d\n", subdev_number, erase_bounds[subdev_number].addr,
> erase_bounds[subdev_number].len);
> +
> + if(from + len_done >=
> stripe->subdev_last_offset[stripe->num_subdev - dev_count])
> + dev_count--;
> + }
> +
> + /* now do the erase: */
> + err = 0;
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + if(erase_bounds[i].need_erase)
> + {
> + if (!(stripe->subdev[i]->flags & MTD_WRITEABLE))
> + {
> + err = -EROFS;
> + break;
> + }
> +
> + stripe_add_subop(&ops[i], erase_bounds[i].addr,
> erase_bounds[i].len, (u_char *)instr, NULL);
> + }
> + }
> +
> + /* Push operation queues into the corresponding threads */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + if(erase_bounds[i].need_erase)
> + {
> + stripe_add_op(&stripe->sw_threads[i], &ops[i]);
> +
> + /* set original operation priority */
> + ops[i].op_prio = current->static_prio - MAX_RT_PRIO - 20;
> + stripe_set_write_thread_prio(&stripe->sw_threads[i]);
> +
> + up(&stripe->sw_threads[i].sw_thread_wait);
> + }
> + }
> +
> + /* wait for all suboperations completed and check status */
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + if(erase_bounds[i].need_erase)
> + {
> + down(&ops[i].sem);
> +
> + /* set error if one of operations has failed */
> + if(ops[i].status)
> + {
> + err = ops[i].status;
> +
> + /* FIX ME: For now this adddres shows address
> + * at the last failed subdevice,
> + * but not at the "super" device */
> + if(ops[i].fail_addr != 0xffffffff)
> + instr->fail_addr = ops[i].fail_addr;
> + }
> +
> + instr->state = ops[i].state;
> + }
> + }
> +
> + /* Deallocate all memory before exit */
> + kfree(erase_bounds);
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + stripe_destroy_op(&ops[i]);
> + }
> + kfree(ops);
> +
> + if(err)
> + return err;
> +
> + if(instr->callback)
> + instr->callback(instr);
> + return 0;
> +}
> +
> +static int
> +stripe_lock(struct mtd_info *mtd, loff_t ofs, size_t len)
> +{
> + u_int32_t ofs_loc = (u_int32_t)ofs; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to lock
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be locked @
> subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to lock left
> (bytes) */
> +
> + size_t retlen = 0;
> + struct mtd_stripe_erase_bounds *erase_bounds;
> +
> + /* Check whole striped device bounds here */
> + if(ofs_loc + len > mtd->size)
> + return err;
> +
> + /* allocate memory for lock boundaries for all subdevices */
> + erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
> mtd_stripe_erase_bounds), GFP_KERNEL);
> + if(!erase_bounds)
> + return -ENOMEM;
> + memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
> stripe->num_subdev);
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(ofs_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((ofs_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((ofs_loc - stripe->subdev_last_offset[i
> - 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (ofs_loc / stripe->interleave_size) / dev_count;
> + subdev_number = (ofs_loc / stripe->interleave_size) % dev_count;
> + }
> +
> + subdev_offset_low = ofs_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add/extend block-to-be locked */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset_low;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> +
> + retlen += subdev_len;
> + len_left -= subdev_len;
> + if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> +
> + while(len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Add/extend block-to-be locked */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset *
> stripe->interleave_size;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> +
> + retlen += subdev_len;
> + len_left -= subdev_len;
> +
> + if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev
> - dev_count])
> + dev_count--;
> + }
> +
> + /* now do lock */
> + err = 0;
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + if(erase_bounds[i].need_erase)
> + {
> + if (stripe->subdev[i]->lock)
> + {
> + err = stripe->subdev[i]->lock(stripe->subdev[i],
> erase_bounds[i].addr, erase_bounds[i].len);
> + if(err)
> + break;
> + };
> + }
> + }
> +
> + /* Free allocated memory here */
> + kfree(erase_bounds);
> +
> + return err;
> +}
> +
> +static int
> +stripe_unlock(struct mtd_info *mtd, loff_t ofs, size_t len)
> +{
> + u_int32_t ofs_loc = (u_int32_t)ofs; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to unlock
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be unlocked @
> subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = len; /* total data size to unlock
> left (bytes) */
> +
> + size_t retlen = 0;
> + struct mtd_stripe_erase_bounds *erase_bounds;
> +
> + /* Check whole striped device bounds here */
> + if(ofs_loc + len > mtd->size)
> + return err;
> +
> + /* allocate memory for unlock boundaries for all subdevices */
> + erase_bounds = kmalloc(stripe->num_subdev * sizeof(struct
> mtd_stripe_erase_bounds), GFP_KERNEL);
> + if(!erase_bounds)
> + return -ENOMEM;
> + memset(erase_bounds, 0, sizeof(struct mtd_stripe_erase_bounds) *
> stripe->num_subdev);
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(ofs_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((ofs_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((ofs_loc - stripe->subdev_last_offset[i
> - 1]) / stripe->interleave_size) % dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (ofs_loc / stripe->interleave_size) / dev_count;
> + subdev_number = (ofs_loc / stripe->interleave_size) % dev_count;
> + }
> +
> + subdev_offset_low = ofs_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* Add/extend block-to-be unlocked */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset_low;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> +
> + retlen += subdev_len;
> + len_left -= subdev_len;
> + if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> +
> + while(len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* Add/extend block-to-be unlocked */
> + if(!erase_bounds[subdev_number].need_erase)
> + {
> + erase_bounds[subdev_number].need_erase = 1;
> + erase_bounds[subdev_number].addr = subdev_offset *
> stripe->interleave_size;
> + }
> + erase_bounds[subdev_number].len += subdev_len;
> +
> + retlen += subdev_len;
> + len_left -= subdev_len;
> +
> + if(ofs + retlen >= stripe->subdev_last_offset[stripe->num_subdev
> - dev_count])
> + dev_count--;
> + }
> +
> + /* now do unlock */
> + err = 0;
> + for(i = 0; i < stripe->num_subdev; i++)
> + {
> + if(erase_bounds[i].need_erase)
> + {
> + if (stripe->subdev[i]->unlock)
> + {
> + err = stripe->subdev[i]->unlock(stripe->subdev[i],
> erase_bounds[i].addr, erase_bounds[i].len);
> + if(err)
> + break;
> + };
> + }
> + }
> +
> + /* Free allocated memory here */
> + kfree(erase_bounds);
> +
> + return err;
> +}
> +
> +static void
> +stripe_sync(struct mtd_info *mtd)
> +{
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int i;
> +
> + for (i = 0; i < stripe->num_subdev; i++)
> + {
> + struct mtd_info *subdev = stripe->subdev[i];
> + if (subdev->sync)
> + subdev->sync(subdev);
> + }
> +}
> +
> +static int
> +stripe_suspend(struct mtd_info *mtd)
> +{
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int i, rc = 0;
> +
> + for (i = 0; i < stripe->num_subdev; i++)
> + {
> + struct mtd_info *subdev = stripe->subdev[i];
> + if (subdev->suspend)
> + {
> + if ((rc = subdev->suspend(subdev)) < 0)
> + return rc;
> + };
> + }
> + return rc;
> +}
> +
> +static void
> +stripe_resume(struct mtd_info *mtd)
> +{
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int i;
> +
> + for (i = 0; i < stripe->num_subdev; i++)
> + {
> + struct mtd_info *subdev = stripe->subdev[i];
> + if (subdev->resume)
> + subdev->resume(subdev);
> + }
> +}
> +
> +static int
> +stripe_block_isbad(struct mtd_info *mtd, loff_t ofs)
> +{
> + u_int32_t from_loc = (u_int32_t)ofs; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int res = 0;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = mtd->oobblock; /* total data size to read/write
> left (bytes) */
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_block_isbad(): offset = 0x%08x\n",
> from_loc);
> +
> + from_loc = (from_loc / mtd->oobblock) * mtd->oobblock; /* align
> offset here */
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* check block on subdevice is bad here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_isbad(): device = %d, offset
> = 0x%08x\n", subdev_number, subdev_offset_low);
> + res =
> stripe->subdev[subdev_number]->block_isbad(stripe->subdev[subdev_number]
> , subdev_offset_low);
> + if(!res)
> + {
> + len_left -= subdev_len;
> + from_loc += subdev_len;
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + while(!res && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* check block on subdevice is bad here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_isbad(): device = %d,
> offset = 0x%08x\n", subdev_number, subdev_offset *
> stripe->interleave_size);
> + res =
> stripe->subdev[subdev_number]->block_isbad(stripe->subdev[subdev_number]
> , subdev_offset * stripe->interleave_size);
> + if(res)
> + {
> + break;
> + }
> + else
> + {
> + len_left -= subdev_len;
> + from_loc += subdev_len;
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev
> - dev_count])
> + dev_count--;
> + }
> + }
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_block_isbad()\n");
> + return res;
> +}
> +
> +/* returns 0 - success */
> +static int
> +stripe_block_markbad(struct mtd_info *mtd, loff_t ofs)
> +{
> + u_int32_t from_loc = (u_int32_t)ofs; /* we can do this since
> whole MTD size in current implementation has u_int32_t type */
> +
> + struct mtd_stripe *stripe = STRIPE(mtd);
> + int err = -EINVAL;
> + int i;
> +
> + u_int32_t subdev_offset; /* equal size subdevs offset
> (interleaved block size count)*/
> + u_int32_t subdev_number; /* number of current subdev */
> + u_int32_t subdev_offset_low; /* subdev offset to read/write
> (bytes). used for "first" probably unaligned with erasesize data block
> */
> + size_t subdev_len; /* data size to be read/written
> from/to subdev at this turn (bytes) */
> + int dev_count; /* equal size subdev count */
> + size_t len_left = mtd->oobblock; /* total data size to read/write
> left (bytes) */
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "stripe_block_markbad(): offset =
> 0x%08x\n", from_loc);
> +
> + from_loc = (from_loc / mtd->oobblock) * mtd->oobblock; /* align
> offset here */
> +
> + /* Locate start position and corresponding subdevice number */
> + subdev_offset = 0;
> + subdev_number = 0;
> + dev_count = stripe->num_subdev;
> + for(i = (stripe->num_subdev - 1); i > 0; i--)
> + {
> + if(from_loc >= stripe->subdev_last_offset[i-1])
> + {
> + dev_count = stripe->num_subdev - i; /* get "equal size"
> devices count */
> + subdev_offset = stripe->subdev[i - 1]->size /
> stripe->interleave_size - 1;
> + subdev_offset += ((from_loc - stripe->subdev_last_offset[i -
> 1]) / stripe->interleave_size) / dev_count;
> + subdev_number = i + ((from_loc -
> stripe->subdev_last_offset[i - 1]) / stripe->interleave_size) %
> dev_count;
> + break;
> + }
> + }
> +
> + if(subdev_offset == 0)
> + {
> + subdev_offset = (from_loc / stripe->interleave_size) /
> dev_count;
> + subdev_number = (from_loc / stripe->interleave_size) %
> dev_count;
> + }
> +
> + subdev_offset_low = from_loc % stripe->interleave_size;
> + subdev_len = (len_left < (stripe->interleave_size -
> subdev_offset_low)) ? len_left : (stripe->interleave_size -
> subdev_offset_low);
> + subdev_offset_low += subdev_offset * stripe->interleave_size;
> +
> + /* check block on subdevice is bad here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_markbad(): device = %d,
> offset = 0x%08x\n", subdev_number, subdev_offset_low);
> + err =
> stripe->subdev[subdev_number]->block_markbad(stripe->subdev[subdev_numbe
> r], subdev_offset_low);
> + if(!err)
> + {
> + len_left -= subdev_len;
> + from_loc += subdev_len;
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev -
> dev_count])
> + dev_count--;
> + }
> +
> + while(!err && len_left > 0 && dev_count > 0)
> + {
> + subdev_number++;
> + if(subdev_number >= stripe->num_subdev)
> + {
> + subdev_number = stripe->num_subdev - dev_count;
> + subdev_offset++;
> + }
> + subdev_len = (len_left < stripe->interleave_size) ? len_left :
> stripe->interleave_size;
> +
> + /* check block on subdevice is bad here */
> + DEBUG(MTD_DEBUG_LEVEL3, "stripe_block_markbad(): device = %d,
> offset = 0x%08x\n", subdev_number, subdev_offset *
> stripe->interleave_size);
> + err =
> stripe->subdev[subdev_number]->block_markbad(stripe->subdev[subdev_numbe
> r], subdev_offset * stripe->interleave_size);
> + if(err)
> + {
> + break;
> + }
> + else
> + {
> + len_left -= subdev_len;
> + from_loc += subdev_len;
> + if(from_loc >= stripe->subdev_last_offset[stripe->num_subdev
> - dev_count])
> + dev_count--;
> + }
> + }
> +
> + DEBUG(MTD_DEBUG_LEVEL2, "<== stripe_block_markbad()\n");
> + return err;
> +}
> +
> +/*
> + * This function constructs a virtual MTD device by interleaving
> (striping)
> + * num_devs MTD devices. A pointer to the new device object is
> + * stored to *new_dev upon success. This function does _not_
> + * register any devices: this is the caller's responsibility.
> + */
> +struct mtd_info *mtd_stripe_create(struct mtd_info *subdev[], /*
> subdevices to stripe */
> + int num_devs, /*
> number of subdevices */
> + char *name, /* name
> for the new device */
> + int interleave_size) /*
> interleaving size (sanity check is required) */
> +{
> + int i,j;
> + size_t size;
> + struct mtd_stripe *stripe;
> + u_int32_t curr_erasesize;
> + int sort_done = 0;
> +
> + printk(KERN_NOTICE "Striping MTD devices:\n");
> + for (i = 0; i < num_devs; i++)
> + printk(KERN_NOTICE "(%d): \"%s\"\n", i, subdev[i]->name);
> + printk(KERN_NOTICE "into device \"%s\"\n", name);
> +
> + /* check if trying to stripe same device */
> + for(i = 0; i < num_devs; i++)
> + {
> + for(j = i; j < num_devs; j++)
> + {
> + if(i != j && !(strcmp(subdev[i]->name,subdev[j]->name)))
> + {
> + printk(KERN_ERR "MTD Stripe failed. The same subdevice
> names were found.\n");
> + return NULL;
> + }
> + }
> + }
> +
> + /* allocate the device structure */
> + size = SIZEOF_STRUCT_MTD_STRIPE(num_devs);
> + stripe = kmalloc(size, GFP_KERNEL);
> + if (!stripe)
> + {
> + printk(KERN_ERR "mtd_stripe_create(): memory allocation
> error\n");
> + return NULL;
> + }
> + memset(stripe, 0, size);
> + stripe->subdev = (struct mtd_info **) (stripe + 1);
> + stripe->subdev_last_offset = (u_int32_t *) ((char *)(stripe + 1) +
> num_devs * sizeof(struct mtd_info *));
> + stripe->sw_threads = (struct mtd_sw_thread_info *)((char *)(stripe
> + 1) + num_devs * sizeof(struct mtd_info *) + num_devs *
> sizeof(u_int32_t));
> +
> + /*
> + * Set up the new "super" device's MTD object structure, check for
> + * incompatibilites between the subdevices.
> + */
> + stripe->mtd.type = subdev[0]->type;
> + stripe->mtd.flags = subdev[0]->flags;
> + stripe->mtd.size = subdev[0]->size;
> + stripe->mtd.erasesize = subdev[0]->erasesize;
> + stripe->mtd.oobblock = subdev[0]->oobblock;
> + stripe->mtd.oobsize = subdev[0]->oobsize;
> + stripe->mtd.oobavail = subdev[0]->oobavail;
> + stripe->mtd.ecctype = subdev[0]->ecctype;
> + stripe->mtd.eccsize = subdev[0]->eccsize;
> + if (subdev[0]->read_ecc)
> + stripe->mtd.read_ecc = stripe_read_ecc;
> + if (subdev[0]->write_ecc)
> + stripe->mtd.write_ecc = stripe_write_ecc;
> + if (subdev[0]->read_oob)
> + stripe->mtd.read_oob = stripe_read_oob;
> + if (subdev[0]->write_oob)
> + stripe->mtd.write_oob = stripe_write_oob;
> +
> + stripe->subdev[0] = subdev[0];
> +
> + for(i = 1; i < num_devs; i++)
> + {
> + /*
> + * Check device compatibility,
> + */
> + if(stripe->mtd.type != subdev[i]->type)
> + {
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): incompatible device
> type on \"%s\"\n",
> + subdev[i]->name);
> + return NULL;
> + }
> +
> + /*
> + * Check MTD flags
> + */
> + if(stripe->mtd.flags != subdev[i]->flags)
> + {
> + /*
> + * Expect all flags to be
> + * equal on all subdevices.
> + */
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): incompatible device
> flags on \"%s\"\n",
> + subdev[i]->name);
> + return NULL;
> + }
> +
> + stripe->mtd.size += subdev[i]->size;
> +
> + /*
> + * Check OOB and ECC data
> + */
> + if (stripe->mtd.oobblock != subdev[i]->oobblock ||
> + stripe->mtd.oobsize != subdev[i]->oobsize ||
> + stripe->mtd.oobavail != subdev[i]->oobavail ||
> + stripe->mtd.ecctype != subdev[i]->ecctype ||
> + stripe->mtd.eccsize != subdev[i]->eccsize ||
> + !stripe->mtd.read_ecc != !subdev[i]->read_ecc ||
> + !stripe->mtd.write_ecc != !subdev[i]->write_ecc ||
> + !stripe->mtd.read_oob != !subdev[i]->read_oob ||
> + !stripe->mtd.write_oob != !subdev[i]->write_oob)
> + {
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): incompatible OOB or
> ECC data on \"%s\"\n",
> + subdev[i]->name);
> + return NULL;
> + }
> + stripe->subdev[i] = subdev[i];
> + }
> +
> + stripe->num_subdev = num_devs;
> + stripe->mtd.name = name;
> +
> + /*
> + * Main MTD routines
> + */
> + stripe->mtd.erase = stripe_erase;
> + stripe->mtd.read = stripe_read;
> + stripe->mtd.write = stripe_write;
> + stripe->mtd.sync = stripe_sync;
> + stripe->mtd.lock = stripe_lock;
> + stripe->mtd.unlock = stripe_unlock;
> + stripe->mtd.suspend = stripe_suspend;
> + stripe->mtd.resume = stripe_resume;
> +
> +#ifdef MTD_PROGRAM_REGIONS
> + /* Montavista patch for Sibley support detected */
> + if((stripe->mtd.flags & MTD_PROGRAM_REGIONS) ||
> (stripe->mtd.flags & MTD_ECC))
> + stripe->mtd.writev = stripe_writev;
> +#else
> + if(stripe->mtd.flags & MTD_ECC)
> + stripe->mtd.writev = stripe_writev;
> +#endif
> +
> + /* not sure about that case. probably should be used not only for
> NAND */
> + if(stripe->mtd.type == MTD_NANDFLASH)
> + stripe->mtd.writev_ecc = stripe_writev_ecc;
> +
> + if(subdev[0]->block_isbad)
> + stripe->mtd.block_isbad = stripe_block_isbad;
> +
> + if(subdev[0]->block_markbad)
> + stripe->mtd.block_markbad = stripe_block_markbad;
> +
> + /* NAND specific */
> + if(stripe->mtd.type == MTD_NANDFLASH)
> + {
> + stripe->mtd.oobblock *= num_devs;
> + stripe->mtd.oobsize *= num_devs;
> + stripe->mtd.oobavail *= num_devs; /* oobavail is to be changed
> later in stripe_merge_oobinfo() */
> + stripe->mtd.eccsize *= num_devs;
> + }
> +
> +#ifdef MTD_PROGRAM_REGIONS
> + /* Montavista patch for Sibley support detected */
> + if(stripe->mtd.flags & MTD_PROGRAM_REGIONS)
> + stripe->mtd.oobblock *= num_devs;
> + else if(stripe->mtd.flags & MTD_ECC)
> + stripe->mtd.eccsize *= num_devs;
> +#else
> + if(stripe->mtd.flags & MTD_ECC)
> + stripe->mtd.eccsize *= num_devs;
> +#endif
> +
> + /* Sort all subdevices by their size (from largest to smallest)*/
> + while(!sort_done)
> + {
> + sort_done = 1;
> + for(i=0; i < num_devs - 1; i++)
> + {
> + struct mtd_info *subdev = stripe->subdev[i];
> + if(subdev->size > stripe->subdev[i+1]->size)
> + {
> + stripe->subdev[i] = stripe->subdev[i+1];
> + stripe->subdev[i+1] = subdev;
> + sort_done = 0;
> + }
> + }
> + }
> +
> + /* Create new device with uniform erase size */
> + curr_erasesize = subdev[0]->erasesize;
> + for (i = 1; i < num_devs; i++)
> + {
> + curr_erasesize = lcm(curr_erasesize, subdev[i]->erasesize);
> + }
> + curr_erasesize *= num_devs;
> +
> + /* Check if there are different size devices in the array*/
> + for (i = 1; i < num_devs; i++)
> + {
> + /* note: subdevices must be already sorted by their size here */
> + if(subdev[i - 1]->size > subdev[i]->size)
> + {
> + u_int32_t tmp_erasesize = subdev[i]->erasesize;
> + for(j = 0; j < i; j++)
> + {
> + tmp_erasesize = lcm(tmp_erasesize,
> subdev[j]->erasesize);
> + }
> + tmp_erasesize *= i;
> + curr_erasesize = lcm(curr_erasesize, tmp_erasesize);
> + }
> + }
> +
> + /* Check if erase size found is valid */
> + if(curr_erasesize <= 0)
> + {
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): Can't find lcm of
> subdevice erase sizes\n");
> + return NULL;
> + }
> +
> + /* Check interleave size validity here */
> + if(curr_erasesize % interleave_size)
> + {
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): Wrong interleave size\n");
> + return NULL;
> + }
> + stripe->interleave_size = interleave_size;
> +
> + stripe->mtd.erasesize = curr_erasesize;
> + stripe->mtd.numeraseregions = 0;
> +
> + /* update (truncate) super device size in accordance with new
> erasesize */
> + stripe->mtd.size = (stripe->mtd.size / stripe->mtd.erasesize) *
> stripe->mtd.erasesize;
> +
> + /* Calculate last data offset for each striped device */
> + for (i = 0; i < num_devs; i++)
> + stripe->subdev_last_offset[i] = last_offset(stripe, i);
> +
> + /* NAND specific */
> + if(stripe->mtd.type == MTD_NANDFLASH)
> + {
> + /* Fill oobavail with correct values here */
> + for (i = 0; i < num_devs; i++)
> + stripe->subdev[i]->oobavail =
> stripe_get_oobavail(stripe->subdev[i]);
> +
> + /* Sets new device oobinfo
> + * NAND flash check is performed inside stripe_merge_oobinfo()
> + * - this should be made after subdevices sorting done for
> proper eccpos and oobfree positioning
> + * NOTE: there are some limitations with different size NAND
> devices striping. all devices must have
> + * the same oobfree and eccpos maps */
> + if(stripe_merge_oobinfo(&stripe->mtd, subdev, num_devs))
> + {
> + kfree(stripe);
> + printk(KERN_ERR "mtd_stripe_create(): oobinfo merge has
> failed\n");
> + return NULL;
> + }
> + }
> +
> + /* Create worker threads */
> + for (i = 0; i < num_devs; i++)
> + {
> + if(stripe_start_write_thread(&stripe->sw_threads[i],
> stripe->subdev[i]) < 0)
> + {
> + kfree(stripe);
> + return NULL;
> + }
> + }
> +
> + return &stripe->mtd;
> +}
> +
>
> +EXPORT_SYMBOL(mtd_stripe_init);
> +EXPORT_SYMBOL(mtd_stripe_exit);
>
Why do you need these functions exported?
> +/*
> + * This is the handler for our kernel parameter, called from
> + * main.c::checksetup(). Note that we can not yet kmalloc() anything,
> + * so we only save the commandline for later processing.
> + *
> + * This function needs to be visible for bootloaders.
>
Can you please elaborate on this?
> +struct mtd_info *mtd_stripe_create(
> + struct mtd_info *subdev[], /* subdevices to stripe */
> + int num_devs, /* number of subdevices */
> + char *name, /* name for the new device */
> + int inteleave_size); /* interleaving size */
> +
> +
> +struct mtd_info *mtd_stripe_create(struct mtd_info *subdev[], /*
> subdevices to stripe */
> + int num_devs, /*
> number of subdevices */
> + char *name, /* name
> for the new device */
> + int interleave_size); /*
> interleaving size (sanity check is required) */
>
Cool, it's an important func, why not declare it twice? ;)
Vitaly
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 7:57 [PATCH/RFC] MTD: Striping layer core Belyakov, Alexander
2006-03-30 9:06 ` Vitaly Wool
@ 2006-03-30 10:35 ` Artem B. Bityutskiy
2006-03-30 15:38 ` Alexander Belyakov
2006-03-30 16:32 ` Nicolas Pitre
2006-03-30 12:11 ` Jörn Engel
2 siblings, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-30 10:35 UTC (permalink / raw)
To: Belyakov, Alexander; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
On Thu, 2006-03-30 at 11:57 +0400, Belyakov, Alexander wrote:
> Hello again!
>
> As it was promised I have split patch with striping and related stuff
> into three parts.
Description is much better, thanks.
> But due to some properties of interleaving algorithm it is very likely
> get increased erase size in case of striping thre ore more devices with
> different size.
Brrr... The resulting eraseblock size is anyway increased. I guess you
wanted to say that one may end up with *substantially* increased
eraseblock size, right?
> 4.3. How to choose interleave size?
> Sub-devices should belong to different (independent) physical flash
> chips in order to get performance increase. Interleave size describes
> striping granularity and it is very important from performance point of
> view. Write operation performance increase should be expected only if
> the amount of data to be written larger than interleave size. For
> example, if we have 512 bytes interleave size, we see no write speed
> boost for files smaller than 512 bytes. File systems have a write buffer
> of well known size (let it be 4096 bytes). Thus it is not good idea to
> set interleave size larger than 2048 byte if we are striping two flash
> chips and going to use the file system on it. For NOR devices the bottom
> border for interleave size is defined by flash buffer size (64 bytes,
> 128 bytes, etc). But such a small values affects read speed on striped
> volumes. Read performance decrease on striped volume is due to large
> number of read sub-operations. Thus, if you are going to stripe N
> devices and launch a file system having write buffer of size B, the
> better choice for interleave size is IS = B / N or somewhat smaller, but
> not smaller than single flash chip buffer size.
> For NAND you should use page size as interleave size value.
Working with flashes, it is very handy to use a notion of the minimal
flash Input/Output unit size. For NOR flashes it is 1 byte (or even 1
bit, but better to think about it as 1 byte). For NAND flashes, this is
one NAND page in current MTD. For ECC-NOR flashes, this is something
like 16 bytes, for data flashes this is another and so on.
Its a pity that MTD Interface is not generic enough to provide this
value, but this is a question of time, I believe somebody will come up
with a patch (may be even you?).
So, to be generic, we have to say that the interleave size has to be
multiple to the minimal flash input/output unit size.
You mentioned file systems. I cannot talk about all of them, but at
least JFFS2's write buffer size is equivalent to the minimal I/O size of
the underlying flash. So, to benefit of your striping layer, filesystems
have to (roughly) multiply the write buffer size on the number of
striped flashes.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 9:06 ` Vitaly Wool
@ 2006-03-30 11:50 ` Artem B. Bityutskiy
2006-03-30 12:15 ` Vitaly Wool
2006-03-30 15:24 ` Alexander Belyakov
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-30 11:50 UTC (permalink / raw)
To: Vitaly Wool
Cc: Belyakov, Alexander, Korolev, Alexey, linux-mtd,
Kutergin, Timofey
On Thu, 2006-03-30 at 13:06 +0400, Vitaly Wool wrote:
> Hi Alexander,
>
> Belyakov, Alexander wrote:
> > One may say that striping is quite similar to already existing in MTD
> > concatenation layer. That is not true since these layers have some sharp
> > distinctions. The first one is the purpose. Concatenation only purpose
> > is to make larger device from several smaller devices. Striping purpose
> > is to make devices operate faster. Next difference is provided access to
> > sub-devices. Concatenation layer provides linear access to sub-devices.
> > Striping provides interleaved access to sub-devices.
> >
> Still it's unclear why not to provide a configurable extension to
> mtdconcat rather than create a new layer.
Well, it is actually quite clear. Yes, in a way this may be considered
as a concatenation, but this is not the purpose of the striping layer.
If you want to concatenate, you don't need all its complexities.
I think that concatenation has to stay concatenation - simple and
straight-forward. It has to do its small task and do it well. No need
to jam tons of the striping code to the tiny'n'shiny concatenation
module.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 7:57 [PATCH/RFC] MTD: Striping layer core Belyakov, Alexander
2006-03-30 9:06 ` Vitaly Wool
2006-03-30 10:35 ` Artem B. Bityutskiy
@ 2006-03-30 12:11 ` Jörn Engel
2006-03-31 6:52 ` Alexander Belyakov
2 siblings, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-30 12:11 UTC (permalink / raw)
To: Belyakov, Alexander; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
On Thu, 30 March 2006 11:57:37 +0400, Belyakov, Alexander wrote:
> diff -uNr a/drivers/mtd/maps/mphysmap.c b/drivers/mtd/maps/mphysmap.c
> --- a/drivers/mtd/maps/mphysmap.c 2006-03-28 12:08:28.000000000
> +0400
> +++ b/drivers/mtd/maps/mphysmap.c 2006-03-28 12:10:48.000000000
> +0400
> @@ -12,6 +12,9 @@
> #ifdef CONFIG_MTD_PARTITIONS
> #include <linux/mtd/partitions.h>
> #endif
> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
> +#include <linux/mtd/stripe.h>
> +#endif
Move the #ifdef into the header.
> static struct map_info mphysmap_static_maps[] = {
> #if CONFIG_MTD_MULTI_PHYSMAP_1_WIDTH
> @@ -155,6 +158,15 @@
> };
> };
> up(&map_mutex);
> +
> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
> +#ifndef MODULE
> + if(mtd_stripe_init()) {
> + printk(KERN_WARNING "MTD stripe initialization from cmdline
> has failed\n");
> + }
> +#endif
> +#endif
> +
> return 0;
> }
o Lindent.
o Dokumentation/CodingStyle
o remove #ifdefs
Your code suffers from a lot of very basic things that make it hard to
review. In its current state, you get a clean NACK from me. Whether
the design makes sense, I didn't even look at. You seem to receive
enough feedback there already.
Jörn
--
You cannot suppose that Moliere ever troubled himself to be original in the
matter of ideas. You cannot suppose that the stories he tells in his plays
have never been told before. They were culled, as you very well know.
-- Andre-Louis Moreau in Scarabouche
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 11:50 ` Artem B. Bityutskiy
@ 2006-03-30 12:15 ` Vitaly Wool
0 siblings, 0 replies; 65+ messages in thread
From: Vitaly Wool @ 2006-03-30 12:15 UTC (permalink / raw)
To: dedekind; +Cc: Belyakov, Alexander, Korolev, Alexey, linux-mtd,
Kutergin, Timofey
Artem B. Bityutskiy wrote:
>> Still it's unclear why not to provide a configurable extension to
>> mtdconcat rather than create a new layer.
>>
>
> Well, it is actually quite clear. Yes, in a way this may be considered
> as a concatenation, but this is not the purpose of the striping layer.
> If you want to concatenate, you don't need all its complexities.
>
> I think that concatenation has to stay concatenation - simple and
> straight-forward. It has to do its small task and do it well. No need
> to jam tons of the striping code to the tiny'n'shiny concatenation
> module.
>
>
Yep, probably you both are right.
Vitaly
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 9:06 ` Vitaly Wool
2006-03-30 11:50 ` Artem B. Bityutskiy
@ 2006-03-30 15:24 ` Alexander Belyakov
2006-03-30 15:39 ` Artem B. Bityutskiy
1 sibling, 1 reply; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-30 15:24 UTC (permalink / raw)
To: Vitaly Wool; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
Hi Vitaly,
Vitaly Wool wrote:
> Still it's unclear why not to provide a configurable extension to
> mtdconcat rather than create a new layer.
Striping and concatenation have different purposes. And extension in
that case will become significantly more complicated than original layer.
> Sooo many threads... :(
One per sub-device.
>>
>> 3. POSSIBLE CONFIGURATIONS AND LIMITATIONS
>> It is possible to stripe devices of the same type. We can't stripe NOR
>> and NAND, but only NOR and NOR or NAND and NAND. Flashes of the same
>> type can differ in erase size and total size.
> Why is that? Being able to deal only with flash chips of the same
> type, your approach has very limited applicability
Please explain how is it possible to stripe NOR device with NAND? And
what are you expecting from such an action?
>> };
>> };
>> up(&map_mutex);
>> +
>> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
>> +#ifndef MODULE
>> + if(mtd_stripe_init()) {
>> + printk(KERN_WARNING "MTD stripe initialization from cmdline
>> has failed\n");
>> + }
>> +#endif
>> +#endif
>>
> @@ -155,6 +158,15 @@
> Bah, what's going on here?
I should remove #ifndef MODULE from here, and from mphysmap_exit() too.
Thanks.
>> +/* Operation codes */
>> +#define MTD_STRIPE_OPCODE_READ 0x1
>> +#define MTD_STRIPE_OPCODE_WRITE 0x2
>> +#define MTD_STRIPE_OPCODE_READ_ECC 0x3
>> +#define MTD_STRIPE_OPCODE_WRITE_ECC 0x4
>> +#define MTD_STRIPE_OPCODE_WRITE_OOB 0x5
>> +#define MTD_STRIPE_OPCODE_ERASE 0x6
>>
> You don't need READ_OOB, eh?
I do not use READ_OOB operation code here. In current implementation
read_oob is being done from context of the caller thread and we do not
push that operation into worker threads queues.
>> +/*
>> + * Miscelaneus support routines
>> + */
> Aint this one and stuff alike gonna be static?
True. I'll do that.
>> + for(i = 1; i < num_devs; i++)
>> + {
>> + if(mtd->oobinfo.useecc != subdev[i]->oobinfo.useecc ||
>> + mtd->oobinfo.eccbytes != subdev[i]->oobinfo.eccbytes)
>> + {
>> + printk(KERN_ERR "stripe_merge_oobinfo(): oobinfo parameters
>> is not compatible for all subdevices\n");
>> + return -EINVAL;
>> + }
>> + }
>>
> I guess this is a limitation that is not mentioned anywhere.
While striping NAND pages become larger for striped device. Again we
have virtual "merging" but somewhat complicated than "merging" applied
to erase blocks. NAND devices to be striped must have the same
characteristics in the current implementation. It is strong limitation
and probably can be toned down in something. But only the most common
usage case for NAND (the identical chips) is considered in presented
patch. I missed that important point in documentation, sorry.
>> +EXPORT_SYMBOL(mtd_stripe_init);
>> +EXPORT_SYMBOL(mtd_stripe_exit);
>>
> Why do you need these functions exported?
At the moment these functions are not supposed to be used by others if
mtdstripe.ko is a standalone module. Actually we do not need them
exported. I'll remove that exports.
>> +/* + * This is the handler for our kernel parameter, called from + *
>> main.c::checksetup(). Note that we can not yet kmalloc() anything,
>> + * so we only save the commandline for later processing.
>> + *
>> + * This function needs to be visible for bootloaders.
>>
> Can you please elaborate on this?
That comment about bootloaders is not supposed to be here. I'll remove
it. The code below that comment just stores part of kernel configuration
string for later processing.
> Cool, it's an important func, why not declare it twice? ;)
>
It is typo, sorry. I'll remove second declaration.
Thanks,
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 10:35 ` Artem B. Bityutskiy
@ 2006-03-30 15:38 ` Alexander Belyakov
2006-03-30 16:32 ` Nicolas Pitre
1 sibling, 0 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-30 15:38 UTC (permalink / raw)
To: dedekind; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
Hi Artem,
Artem B. Bityutskiy wrote:
>> But due to some properties of interleaving algorithm it is very likely
>> get increased erase size in case of striping thre ore more devices with
>> different size.
> Brrr... The resulting eraseblock size is anyway increased. I guess you
> wanted to say that one may end up with *substantially* increased
> eraseblock size, right?
Yes, you're right.
> So, to be generic, we have to say that the interleave size has to be
> multiple to the minimal flash input/output unit size.
Thanks for your clarification!
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 15:24 ` Alexander Belyakov
@ 2006-03-30 15:39 ` Artem B. Bityutskiy
2006-03-31 7:06 ` Alexander Belyakov
0 siblings, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-30 15:39 UTC (permalink / raw)
To: Alexander Belyakov
Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
On Thu, 2006-03-30 at 19:24 +0400, Alexander Belyakov wrote:
> Please explain how is it possible to stripe NOR device with NAND? And
> what are you expecting from such an action?
Well, probably this is a perversion and is not needed in reality, but
still. I conceive it like this. Yo have 2 flashes. You as usually,
calculate the resulting eraseblock size. You see at the minimal I/O unit
size of both flashes and similarly calculate the resulting minimal I/O
size. So that's it. You'll end up with a though perverted, but still a
striped MTD device.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 10:35 ` Artem B. Bityutskiy
2006-03-30 15:38 ` Alexander Belyakov
@ 2006-03-30 16:32 ` Nicolas Pitre
2006-03-30 16:38 ` Artem B. Bityutskiy
2006-03-31 7:19 ` Alexander Belyakov
1 sibling, 2 replies; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-30 16:32 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Belyakov, Alexander, Korolev, Alexey, linux-mtd,
Kutergin, Timofey
On Thu, 30 Mar 2006, Artem B. Bityutskiy wrote:
> Working with flashes, it is very handy to use a notion of the minimal
> flash Input/Output unit size. For NOR flashes it is 1 byte (or even 1
> bit, but better to think about it as 1 byte). For NAND flashes, this is
> one NAND page in current MTD. For ECC-NOR flashes, this is something
> like 16 bytes, for data flashes this is another and so on.
While NOR flash can indeed write one byte (or even one bit) at a time,
it is not really useful, not in the context of stripe at least. The NOR
write buffer size is a much more useful metric.
So while the minimum write size is of course a required parameter (that
would clean up many *_init functions in jffs2/wbuf.c), it is also
necessary to consider the "optimal" write size, being the write buffer
size in the case of NOR flash. Larger writes can be performed as long
as they are multiples of the minimum write size.
Read size is obviously irrelevant as any size can be accommodated by the
driver.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 16:32 ` Nicolas Pitre
@ 2006-03-30 16:38 ` Artem B. Bityutskiy
2006-03-30 16:56 ` Jared Hulbert
2006-03-31 7:19 ` Alexander Belyakov
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-30 16:38 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Belyakov, Alexander, Korolev, Alexey, linux-mtd,
Kutergin, Timofey
On Thu, 2006-03-30 at 11:32 -0500, Nicolas Pitre wrote:
> While NOR flash can indeed write one byte (or even one bit) at a time,
> it is not really useful, not in the context of stripe at least. The NOR
> write buffer size is a much more useful metric.
Err, what do you mean by "the NOR write buffer size"?
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 16:38 ` Artem B. Bityutskiy
@ 2006-03-30 16:56 ` Jared Hulbert
2006-03-30 17:03 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Jared Hulbert @ 2006-03-30 16:56 UTC (permalink / raw)
To: dedekind
Cc: Belyakov, Alexander, Korolev, Alexey, Nicolas Pitre,
Kutergin, Timofey, linux-mtd
> Err, what do you mean by "the NOR write buffer size"?
Most high performance NOR today have a buffered write mode to get
decent write speed. This requires you write a bunch of words to the
chips write buffer and then tell it to program. See
cfi_ident.MaxBufWriteSize in cfi.h.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 16:56 ` Jared Hulbert
@ 2006-03-30 17:03 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-30 17:03 UTC (permalink / raw)
To: Jared Hulbert
Cc: Belyakov, Alexander, Korolev, Alexey, Nicolas Pitre,
Kutergin, Timofey, linux-mtd
On Thu, 2006-03-30 at 08:56 -0800, Jared Hulbert wrote:
> Most high performance NOR today have a buffered write mode to get
> decent write speed. This requires you write a bunch of words to the
> chips write buffer and then tell it to program. See
> cfi_ident.MaxBufWriteSize in cfi.h.
Ah, ok, thanks. Yeah, id does matter. The question though is how to
represent this in a generic way...
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 12:11 ` Jörn Engel
@ 2006-03-31 6:52 ` Alexander Belyakov
2006-03-31 7:57 ` Artem B. Bityutskiy
2006-03-31 8:47 ` Jörn Engel
0 siblings, 2 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 6:52 UTC (permalink / raw)
To: Jörn Engel; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
Jörn Engel wrote:
> On Thu, 30 March 2006 11:57:37 +0400, Belyakov, Alexander wrote:
>> diff -uNr a/drivers/mtd/maps/mphysmap.c b/drivers/mtd/maps/mphysmap.c
>> --- a/drivers/mtd/maps/mphysmap.c 2006-03-28 12:08:28.000000000
>> +0400
>> +++ b/drivers/mtd/maps/mphysmap.c 2006-03-28 12:10:48.000000000
>> +0400
>> @@ -12,6 +12,9 @@
>> #ifdef CONFIG_MTD_PARTITIONS
>> #include <linux/mtd/partitions.h>
>> #endif
>> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
>> +#include <linux/mtd/stripe.h>
>> +#endif
>
> Move the #ifdef into the header.
May I ask what is the reason for that (taking into account that
partitions.h is already included here the same way for the same purpose)?
> Your code suffers from a lot of very basic things that make it hard to
> review.
What exactly you wish to see here? Hope I can bring clarifications by
changing code/comments or via this mailing list.
Thanks,
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 15:39 ` Artem B. Bityutskiy
@ 2006-03-31 7:06 ` Alexander Belyakov
2006-03-31 8:02 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 7:06 UTC (permalink / raw)
To: dedekind; +Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
Artem B. Bityutskiy wrote:
> Well, probably this is a perversion and is not needed in reality, but
> still. I conceive it like this. Yo have 2 flashes. You as usually,
> calculate the resulting eraseblock size. You see at the minimal I/O unit
> size of both flashes and similarly calculate the resulting minimal I/O
> size. So that's it. You'll end up with a though perverted, but still a
> striped MTD device.
First problem in case of striping NOR and NAND is a question about type
of striped device. Should we report it as NOR or as NAND. I believe it
is important for clients to know about that. Imagine, for example,
device reported as NAND behaves as NOR or vice versa. Another problem is
a difference in operation speed. Apparently you won't get any
performance gain. These are only top of iceberg. Note that even plain
and simple mtdconcat is not supposed to work with flashes of different
types.
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-30 16:32 ` Nicolas Pitre
2006-03-30 16:38 ` Artem B. Bityutskiy
@ 2006-03-31 7:19 ` Alexander Belyakov
1 sibling, 0 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 7:19 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
Nicolas Pitre wrote:
> While NOR flash can indeed write one byte (or even one bit) at a time,
> it is not really useful, not in the context of stripe at least. The NOR
> write buffer size is a much more useful metric.
That was mentioned in documentation for striping layer core. "For NOR
devices the bottom border for interleave size is defined by flash buffer
size". But anyway I think it is worth to mention that interleave size in
general case can be less than buffer size.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 6:52 ` Alexander Belyakov
@ 2006-03-31 7:57 ` Artem B. Bityutskiy
2006-03-31 8:11 ` Alexander Belyakov
2006-03-31 8:47 ` Jörn Engel
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 7:57 UTC (permalink / raw)
To: Alexander Belyakov; +Cc: Korolev, Alexey, Kutergin, Timofey, linux-mtd
Alexander Belyakov wrote:
>>> @@ -12,6 +12,9 @@
>>> #ifdef CONFIG_MTD_PARTITIONS
>>> #include <linux/mtd/partitions.h>
>>> #endif
>>> +#ifdef CONFIG_MTD_CMDLINE_STRIPE
>>> +#include <linux/mtd/stripe.h>
>>> +#endif
>>
>> Move the #ifdef into the header.
>
> May I ask what is the reason for that (taking into account that
> partitions.h is already included here the same way for the same purpose)?
Just in general, the
#ifdef EPRST
#include <elki-palki.h>
#endif
is not very nice. It is nicer to add the #ifdef protection in the header
itself. This is not a major problem, but it is just nicer and Linux
people like this. Just incapsulate all the protection in the header
itself. Don't spread the protection over many .c files which may
potentially want to include your header.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 7:06 ` Alexander Belyakov
@ 2006-03-31 8:02 ` Artem B. Bityutskiy
2006-03-31 8:05 ` Artem B. Bityutskiy
2006-03-31 16:49 ` Nicolas Pitre
0 siblings, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 8:02 UTC (permalink / raw)
To: Alexander Belyakov
Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
Alexander Belyakov wrote:
>> Well, probably this is a perversion and is not needed in reality, but
>> still. I conceive it like this. Yo have 2 flashes. You as usually,
>> calculate the resulting eraseblock size. You see at the minimal I/O unit
>> size of both flashes and similarly calculate the resulting minimal I/O
>> size. So that's it. You'll end up with a though perverted, but still a
>> striped MTD device.
>
> First problem in case of striping NOR and NAND is a question about type
> of striped device. Should we report it as NOR or as NAND. I believe it
> is important for clients to know about that. Imagine, for example,
> device reported as NAND behaves as NOR or vice versa. Another problem is
> a difference in operation speed. Apparently you won't get any
> performance gain. These are only top of iceberg. Note that even plain
> and simple mtdconcat is not supposed to work with flashes of different
> types.
Good question. I think you could report this is a striped device
(introducing an MTD_STRIPED option). Also you may provide a
stripe_get_info(struct mtd_info *mtd) function which will return a
struct stripe_info object describing this striped device, including the
components it consists of.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:02 ` Artem B. Bityutskiy
@ 2006-03-31 8:05 ` Artem B. Bityutskiy
2006-03-31 8:17 ` Alexander Belyakov
2006-03-31 9:27 ` Jörn Engel
2006-03-31 16:49 ` Nicolas Pitre
1 sibling, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 8:05 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
Artem B. Bityutskiy wrote:
> Good question. I think you could report this is a striped device
> (introducing an MTD_STRIPED option). Also you may provide a
> stripe_get_info(struct mtd_info *mtd) function which will return a
> struct stripe_info object describing this striped device, including the
> components it consists of.
Err, and I believe you *have to* report mtd type as MTD_STRIPED.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 7:57 ` Artem B. Bityutskiy
@ 2006-03-31 8:11 ` Alexander Belyakov
2006-03-31 8:31 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 8:11 UTC (permalink / raw)
To: Artem B. Bityutskiy; +Cc: Korolev, Alexey, Kutergin, Timofey, linux-mtd
Artem B. Bityutskiy wrote:
> Just in general, the
>
> #ifdef EPRST
> #include <elki-palki.h>
> #endif
>
> is not very nice. It is nicer to add the #ifdef protection in the
> header itself. This is not a major problem, but it is just nicer and
> Linux people like this. Just incapsulate all the protection in the
> header itself. Don't spread the protection over many .c files which
> may potentially want to include your header.
>
I followed the already existing example for partitions.h in the same
mphysmap.c file. Just thought it's OK. I'll try to fix that, thanks.
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:05 ` Artem B. Bityutskiy
@ 2006-03-31 8:17 ` Alexander Belyakov
2006-03-31 8:38 ` Artem B. Bityutskiy
2006-03-31 8:55 ` Artem B. Bityutskiy
2006-03-31 9:27 ` Jörn Engel
1 sibling, 2 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 8:17 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
Artem B. Bityutskiy wrote:
> Artem B. Bityutskiy wrote:
>> Good question. I think you could report this is a striped device
>> (introducing an MTD_STRIPED option). Also you may provide a
>> stripe_get_info(struct mtd_info *mtd) function which will return a
>> struct stripe_info object describing this striped device, including
>> the components it consists of.
> Err, and I believe you *have to* report mtd type as MTD_STRIPED.
>
In that case clients should be aware of using striped devices and
provide special support for them. The one of the ideas of the suggested
solution is to hide striping internals from the client providing generic
mtd device (just with somewhat increased performance).
Alexander Belyakov
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:11 ` Alexander Belyakov
@ 2006-03-31 8:31 ` Artem B. Bityutskiy
2006-03-31 8:35 ` Alexander Belyakov
0 siblings, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 8:31 UTC (permalink / raw)
To: Alexander Belyakov; +Cc: Korolev, Alexey, Kutergin, Timofey, linux-mtd
Alexander Belyakov wrote:
> I followed the already existing example for partitions.h in the same
> mphysmap.c file. Just thought it's OK. I'll try to fix that, thanks.
You know, Linux is eveloping, and older code is not always the best
coding example. If you read LKML, you should have noticed how strict
people are, a lot of those patches which would had been accepted in the
past are blamed and rejected now. And this is normal IMO, people study
on their own mistakes.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:31 ` Artem B. Bityutskiy
@ 2006-03-31 8:35 ` Alexander Belyakov
0 siblings, 0 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-03-31 8:35 UTC (permalink / raw)
To: Artem B. Bityutskiy; +Cc: Korolev, Alexey, Kutergin, Timofey, linux-mtd
Artem B. Bityutskiy wrote:
> Alexander Belyakov wrote:
>> I followed the already existing example for partitions.h in the same
>> mphysmap.c file. Just thought it's OK. I'll try to fix that, thanks.
>
> You know, Linux is eveloping, and older code is not always the best
> coding example. If you read LKML, you should have noticed how strict
> people are, a lot of those patches which would had been accepted in
> the past are blamed and rejected now. And this is normal IMO, people
> study on their own mistakes.
>
That's OK. I see no problems here. I'll fix that.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:17 ` Alexander Belyakov
@ 2006-03-31 8:38 ` Artem B. Bityutskiy
2006-03-31 8:55 ` Artem B. Bityutskiy
1 sibling, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 8:38 UTC (permalink / raw)
To: Alexander Belyakov
Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
Alexander Belyakov wrote:
> In that case clients should be aware of using striped devices and
> provide special support for them. The one of the ideas of the suggested
> solution is to hide striping internals from the client providing generic
> mtd device (just with somewhat increased performance).
False.
People don't have to look at mtd->type and may be happy. But if they do
want to do some mtd type-specific things, they do look at mtd type,
recognize what is this flash, and do the flash-specific things.
This does not work in case of NAND at the moment. Indeed, users have to
use weird mtd->read_ecc() instead of just mtd->read(), etc. But this is
long agreed as bad interface and will be fixed some time later. In future,
I believe, we'll have a common generic flash model, and a generic MTD
interface. And everyone will be able to work with all flash types in the
same generic way. Although if users will want to do some flash-specific
things, they will look at mtd->type, and have a big switch doing
whatever flash-specific actions are wanted.
MTD is currently rather far from this nice picture, but this is a
question of time I believe.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 6:52 ` Alexander Belyakov
2006-03-31 7:57 ` Artem B. Bityutskiy
@ 2006-03-31 8:47 ` Jörn Engel
1 sibling, 0 replies; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 8:47 UTC (permalink / raw)
To: Alexander Belyakov; +Cc: Korolev, Alexey, linux-mtd, Kutergin, Timofey
On Fri, 31 March 2006 10:52:59 +0400, Alexander Belyakov wrote:
>
> >Your code suffers from a lot of very basic things that make it hard to
> >review.
>
> What exactly you wish to see here? Hope I can bring clarifications by
> changing code/comments or via this mailing list.
To some degree, I leave it to you to take a look at your code and
improve it. There is little point in me pointing out every detail.
In that case I could just write the code myself.
As a starting point, you should take a close look at
Dokumentation/CodingStyle and will notice that your code has a rather
different style. Worse, your style is not even consistent within
itself. People will be more inclined to take a closer look if the
code looks similar to what they would expect elsewhere.
Then there's the general rule to make code simple to read. #ifdef is
a concept that makes code complicated. Avoiding it where possible is
a good idea. Having the #ifdef once in a header is better than having
it once in each source file including the header. Having pretty much
anything once is better than having it in many places.
When you're done with this, please come back. I'll take a closer look
then.
Jörn
--
Public Domain - Free as in Beer
General Public - Free as in Speech
BSD License - Free as in Enterprise
Shared Source - Free as in "Work will make you..."
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:17 ` Alexander Belyakov
2006-03-31 8:38 ` Artem B. Bityutskiy
@ 2006-03-31 8:55 ` Artem B. Bityutskiy
2006-03-31 16:59 ` Nicolas Pitre
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 8:55 UTC (permalink / raw)
To: Alexander Belyakov
Cc: Korolev, Alexey, Vitaly Wool, Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 12:17 +0400, Alexander Belyakov wrote:
> In that case clients should be aware of using striped devices and
> provide special support for them. The one of the ideas of the suggested
> solution is to hide striping internals from the client providing generic
> mtd device (just with somewhat increased performance).
I think you are one more guy who hit on the not generic enough
interface.
Indeed, on the one hand, you want to inform the type of flash used. If
this is NAND, people have to use NAND-specific stuff like
mtd->write_ecc. If this is NOR, they have to use NOR-related stuff like
mtd->write, mtd->point and so on.
On the other hand, you still want to inform users that this is a striped
MTD device. They may want to know this.
And the best thing you can do is not to adopt yourself to far too old
MTD interface, which is like this for historical reasons, but to make a
revolution in the MTD interface itself.
By revolution I mean:
1. To invent a common generic flash model.
2. To make MTD interface generic.
3. To fix existing users. Well, I would say fixing JFFS2 may be enough.
This is a big piece of work, but it is what I consider a professional
approach.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:05 ` Artem B. Bityutskiy
2006-03-31 8:17 ` Alexander Belyakov
@ 2006-03-31 9:27 ` Jörn Engel
2006-03-31 9:36 ` Artem B. Bityutskiy
1 sibling, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 9:27 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 12:05:48 +0400, Artem B. Bityutskiy wrote:
> Artem B. Bityutskiy wrote:
> >Good question. I think you could report this is a striped device
> >(introducing an MTD_STRIPED option). Also you may provide a
> >stripe_get_info(struct mtd_info *mtd) function which will return a
> >struct stripe_info object describing this striped device, including the
> >components it consists of.
> Err, and I believe you *have to* report mtd type as MTD_STRIPED.
URGH!
The current mess that makes up mtd->type and mtd->flags needs to be
sanitized anyway. Instead of being MTD_NAND, MTD_NOR or MTD_STRIPED,
it should tell the user _how_ to treat the device, not _what_ it is.
Basically, whereever a user (jffs2 basically) has
if (mtd->type == MTD_FOO)
setup_bar;
it should actually do
if (mtd->flags == MTD_NEEDS_BAR)
setup_bar;
Quite likely there won't be many flags left after the cleanup is done.
Most of them should simple be erase_size and write_size (page_size or
ecc_size currently), not flags.
Jörn
--
You can't tell where a program is going to spend its time. Bottlenecks
occur in surprising places, so don't try to second guess and put in a
speed hack until you've proven that's where the bottleneck is.
-- Rob Pike
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 9:27 ` Jörn Engel
@ 2006-03-31 9:36 ` Artem B. Bityutskiy
2006-03-31 9:40 ` Jörn Engel
0 siblings, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 9:36 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
Jörn Engel wrote:
> The current mess that makes up mtd->type and mtd->flags needs to be
> sanitized anyway. Instead of being MTD_NAND, MTD_NOR or MTD_STRIPED,
> it should tell the user _how_ to treat the device, not _what_ it is.
> Basically, whereever a user (jffs2 basically) has
> if (mtd->type == MTD_FOO)
> setup_bar;
> it should actually do
> if (mtd->flags == MTD_NEEDS_BAR)
> setup_bar;
>
> Quite likely there won't be many flags left after the cleanup is done.
> Most of them should simple be erase_size and write_size (page_size or
> ecc_size currently), not flags.
No, mtd->type has to tell you the type of the MTD device. Ideally, this
has to me the only flash-specific field in the mtd_info structure. And
if users want to do some flash specific things, they have to look at
mtd->type, realize what is the subsystem which handles this flash, and
start working with this subsystem. For striping, this is the striping
subsystem. I don't know what for mtd->flags, probably this hast to go at
all.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 9:36 ` Artem B. Bityutskiy
@ 2006-03-31 9:40 ` Jörn Engel
2006-03-31 10:00 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 9:40 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 13:36:43 +0400, Artem B. Bityutskiy wrote:
>
> No, mtd->type has to tell you the type of the MTD device. Ideally, this
> has to me the only flash-specific field in the mtd_info structure. And
> if users want to do some flash specific things, they have to look at
> mtd->type, realize what is the subsystem which handles this flash, and
> start working with this subsystem. For striping, this is the striping
> subsystem. I don't know what for mtd->flags, probably this hast to go at
> all.
Is this exported to userspace via mtdchar?
Jörn
--
Das Aufregende am Schreiben ist es, eine Ordnung zu schaffen, wo
vorher keine existiert hat.
-- Doris Lessing
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 9:40 ` Jörn Engel
@ 2006-03-31 10:00 ` Artem B. Bityutskiy
2006-03-31 10:06 ` Artem B. Bityutskiy
2006-03-31 10:07 ` Jörn Engel
0 siblings, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 10:00 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 11:40 +0200, Jörn Engel wrote:
> On Fri, 31 March 2006 13:36:43 +0400, Artem B. Bityutskiy wrote:
> >
> > No, mtd->type has to tell you the type of the MTD device. Ideally, this
> > has to me the only flash-specific field in the mtd_info structure. And
> > if users want to do some flash specific things, they have to look at
> > mtd->type, realize what is the subsystem which handles this flash, and
> > start working with this subsystem. For striping, this is the striping
> > subsystem. I don't know what for mtd->flags, probably this hast to go at
> > all.
>
> Is this exported to userspace via mtdchar?
>
I don't quite understand what is "this".
Ideally yes, there should be a /ubi/devices/mtd/mtdX/type file, there
you can look and realize the type of this MTD device. The contents of
this file will be generated using the mtd-type field. Userspace will be
able to realize the type of this device and look to the corresponding
place in sysfs. For example, if this is a striped device, it'll look
at /sys/devices/mtd_stripe/stripe0/ or whatever.
This is just my imagination how things should look like. So, treat this
correspondingly. May be it is better to use symlinks in sysfs to point
to the striping layer, no sure.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 10:00 ` Artem B. Bityutskiy
@ 2006-03-31 10:06 ` Artem B. Bityutskiy
2006-03-31 10:07 ` Jörn Engel
1 sibling, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 10:06 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 14:00 +0400, Artem B. Bityutskiy wrote:
> Ideally yes, there should be a /ubi/devices/mtd/mtdX/type file, there
Pardon, typo, I meant /sys/devices/mtd/mtdX/type.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 10:00 ` Artem B. Bityutskiy
2006-03-31 10:06 ` Artem B. Bityutskiy
@ 2006-03-31 10:07 ` Jörn Engel
2006-03-31 10:18 ` Artem B. Bityutskiy
2006-03-31 17:06 ` Nicolas Pitre
1 sibling, 2 replies; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 10:07 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 14:00:09 +0400, Artem B. Bityutskiy wrote:
> On Fri, 2006-03-31 at 11:40 +0200, Jörn Engel wrote:
> > On Fri, 31 March 2006 13:36:43 +0400, Artem B. Bityutskiy wrote:
> > >
> > > No, mtd->type has to tell you the type of the MTD device. Ideally, this
> > > has to me the only flash-specific field in the mtd_info structure. And
> > > if users want to do some flash specific things, they have to look at
> > > mtd->type, realize what is the subsystem which handles this flash, and
> > > start working with this subsystem. For striping, this is the striping
> > > subsystem. I don't know what for mtd->flags, probably this hast to go at
> > > all.
> >
> > Is this exported to userspace via mtdchar?
> >
> I don't quite understand what is "this".
>
> Ideally yes...
I take it that mtd->type is not exported to userspace yet. Which is
good, because imho it shouldn't be. Flash should me a step towards a
standard interface, something that hard disk have had for years.
Knowing every little detail about every single flash chip bears more
harm than good.
And as far as in-kernel users are concerned, JFFS2 is the only one
that really matters. For JFFS2, my previous statement still stand
that it should base decisions on the existence of a feature, not a
type.
You could still have flash types, but merely as aggregates of
features, nothing else. For example, NAND flash could have
FEATURE_OOB, FEATURE_WBUF, FEATURE_ECC etc. Which of these feature
make sense and are really needed is another matter, but you should get
the idea.
Jörn
--
/* Keep these two variables together */
int bar;
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 10:07 ` Jörn Engel
@ 2006-03-31 10:18 ` Artem B. Bityutskiy
2006-03-31 11:40 ` Jörn Engel
2006-03-31 17:06 ` Nicolas Pitre
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 10:18 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 12:07 +0200, Jörn Engel wrote:
> I take it that mtd->type is not exported to userspace yet. Which is
> good, because imho it shouldn't be. Flash should me a step towards a
> standard interface, something that hard disk have had for years.
> Knowing every little detail about every single flash chip bears more
> harm than good.
Again, imagine mtd->type is the only flash-specific field in mtd_info.
You work with any flash the same way. But if you still want to do some
flash-specific things, you look at mtd->type, determine which kind of
flash is this, then work with the flash-specific subsystem.
> And as far as in-kernel users are concerned, JFFS2 is the only one
> that really matters. For JFFS2, my previous statement still stand
> that it should base decisions on the existence of a feature, not a
> type.
It looks at mtd->type, finds out the type, and it knows what features
must be there. No flash-specific features are available in mtd_info. It
only contains generic stuff. For example, mtd_info should not even
contain the OOB-handling stuff. If you want to work with OOB then your
application is NAND-bound. Then you look at mtd->type, see this is NAND,
and happily start working with NAND subsystem of MTD.
Note, what I'm saying is far from the current things, again, I'm saying
how things should look like in my oppinion.
> You could still have flash types, but merely as aggregates of
> features, nothing else. For example, NAND flash could have
> FEATURE_OOB, FEATURE_WBUF, FEATURE_ECC etc. Which of these feature
> make sense and are really needed is another matter, but you should get
> the idea.
Decomposition on features is not very nice IMO. It is better to specify
type. Each type has its own set of features. It is too difficult to
recognize all features, classify them nicely and put into mtd_info.
Features go and come, the change.
There may be some feature-bound decomposition, but in parallel to the
decomposition by type, which is the main and more natural decomposition.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 10:18 ` Artem B. Bityutskiy
@ 2006-03-31 11:40 ` Jörn Engel
2006-03-31 11:47 ` Artem B. Bityutskiy
2006-03-31 11:55 ` Artem B. Bityutskiy
0 siblings, 2 replies; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 11:40 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 14:18:44 +0400, Artem B. Bityutskiy wrote:
>
> It looks at mtd->type, finds out the type, and it knows what features
> must be there. No flash-specific features are available in mtd_info. It
> only contains generic stuff. For example, mtd_info should not even
> contain the OOB-handling stuff. If you want to work with OOB then your
> application is NAND-bound. Then you look at mtd->type, see this is NAND,
> and happily start working with NAND subsystem of MTD.
>
> Note, what I'm saying is far from the current things, again, I'm saying
> how things should look like in my oppinion.
Right now, we have code like
#define jffs2_can_mark_obsolete(c) ((c->mtd->type == MTD_NORFLASH && !(c->mtd->flags & MTD_ECC)) || c->mtd->type == MTD_RAM)
I guess we can both agree that it is far from optimal. And my take
would be to replace it with
#define jffs2_can_mark_obsolete(c) (c->mtd->flags & MTD_CAN_MARK_OBSOLETE)
Note that mtd->type has gone.
Jörn
--
A surrounded army must be given a way out.
-- Sun Tzu
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:40 ` Jörn Engel
@ 2006-03-31 11:47 ` Artem B. Bityutskiy
2006-03-31 11:56 ` Jörn Engel
2006-03-31 11:55 ` Artem B. Bityutskiy
1 sibling, 1 reply; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 11:47 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
Jörn Engel wrote:
> Right now, we have code like
>
> #define jffs2_can_mark_obsolete(c) ((c->mtd->type == MTD_NORFLASH && !(c->mtd->flags & MTD_ECC)) || c->mtd->type == MTD_RAM)
>
> I guess we can both agree that it is far from optimal. And my take
> would be to replace it with
>
> #define jffs2_can_mark_obsolete(c) (c->mtd->flags & MTD_CAN_MARK_OBSOLETE)
This piece of code indeed looks better. But nevertheless, MTD does not
have to please JFFS2. It is difficult to foresee all "features" MTD
users can conceive and add corresponding flags. Tomorrow a "XYZ" file
system will apear and will want a "can_do_my_crap" feature, and you'll
have to add this to mtd_info. This is really a bad idea.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:40 ` Jörn Engel
2006-03-31 11:47 ` Artem B. Bityutskiy
@ 2006-03-31 11:55 ` Artem B. Bityutskiy
2006-03-31 11:59 ` Jörn Engel
2006-03-31 17:14 ` Nicolas Pitre
1 sibling, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 11:55 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 13:40 +0200, Jörn Engel wrote:
> Right now, we have code like
>
> #define jffs2_can_mark_obsolete(c) ((c->mtd->type == MTD_NORFLASH && !(c->mtd->flags & MTD_ECC)) || c->mtd->type == MTD_RAM)
>
> I guess we can both agree that it is far from optimal. And my take
> would be to replace it with
>
> #define jffs2_can_mark_obsolete(c) (c->mtd->flags & MTD_CAN_MARK_OBSOLETE)
>
Add an c->flags field to the per-JFFS2 structure. Initialize it on mount
properly. Then
#define jffs2_can_mark_obsolete(c) (c->flags & JFFS2_CAN_MARK_OBSOLETE)
Don't try to jam this to MTD please.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:47 ` Artem B. Bityutskiy
@ 2006-03-31 11:56 ` Jörn Engel
2006-03-31 12:06 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 11:56 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 15:47:10 +0400, Artem B. Bityutskiy wrote:
>
> This piece of code indeed looks better. But nevertheless, MTD does not
> have to please JFFS2. It is difficult to foresee all "features" MTD
> users can conceive and add corresponding flags. Tomorrow a "XYZ" file
> system will apear and will want a "can_do_my_crap" feature, and you'll
> have to add this to mtd_info. This is really a bad idea.
Why? It is not as if we couldn't count all the features that are
currently used and make sense. And in the future, the number of sane
features will still be fairly low. All we have to do is tell people
trying to add "can_do_my_crap" that we don't want such crap in the
kernel.
Note the "and make sense" part in the second sentence. ;)
Jörn
--
The grand essentials of happiness are: something to do, something to
love, and something to hope for.
-- Allan K. Chalmers
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:55 ` Artem B. Bityutskiy
@ 2006-03-31 11:59 ` Jörn Engel
2006-03-31 12:11 ` Artem B. Bityutskiy
2006-03-31 17:14 ` Nicolas Pitre
1 sibling, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 11:59 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 15:55:10 +0400, Artem B. Bityutskiy wrote:
> >
> Add an c->flags field to the per-JFFS2 structure. Initialize it on mount
> properly. Then
>
> #define jffs2_can_mark_obsolete(c) (c->flags & JFFS2_CAN_MARK_OBSOLETE)
>
> Don't try to jam this to MTD please.
Whether the flash can mark things obsolete is a flash feature. At
least this is based on a flash feature. So maybe we should rename it
to MTD_CAN_FLIP_SINGLE_BITS or similar and then have
c->can_mark_obsolete = c->mtd & MTD_CAN_FLIP_SINGLE_BITS;
That would make sense, yes.
Jörn
--
The cost of changing business rules is much more expensive for software
than for a secretaty.
-- unknown
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:56 ` Jörn Engel
@ 2006-03-31 12:06 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 12:06 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Artem B. Bityutskiy
On Fri, 2006-03-31 at 13:56 +0200, Jörn Engel wrote:
> Why? It is not as if we couldn't count all the features that are
> currently used and make sense. And in the future, the number of sane
> features will still be fairly low. All we have to do is tell people
> trying to add "can_do_my_crap" that we don't want such crap in the
> kernel.
>
> Note the "and make sense" part in the second sentence. ;)
Please define the notion of "feature".
If you call "can mark obsolete" a feature, then you assume that feature
may be application-specific. This really does not make sens to add this
to MTD. Application-specific stuff must be in only application.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:59 ` Jörn Engel
@ 2006-03-31 12:11 ` Artem B. Bityutskiy
2006-03-31 12:20 ` Jörn Engel
2006-03-31 17:19 ` Nicolas Pitre
0 siblings, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 12:11 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 13:59 +0200, Jörn Engel wrote:
> Whether the flash can mark things obsolete is a flash feature. At
> least this is based on a flash feature. So maybe we should rename it
> to MTD_CAN_FLIP_SINGLE_BITS or similar and then have
This is better then "can mark obsolete". At lease this does not depend
on application. But I still think that the number of features like is
unpredictable and large. If the number of bits in mtd->flags is not
enough, will you introduce mtd->flags1? This is not very generic
approach.
It is saner to accept the type-based decomposition IMO. Type defines the
set of features. Each application does know which set of features does
it want, so it wants which types can it use.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:11 ` Artem B. Bityutskiy
@ 2006-03-31 12:20 ` Jörn Engel
2006-03-31 12:28 ` Artem B. Bityutskiy
2006-03-31 17:19 ` Nicolas Pitre
1 sibling, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 12:20 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 16:11:12 +0400, Artem B. Bityutskiy wrote:
> On Fri, 2006-03-31 at 13:59 +0200, Jörn Engel wrote:
> > Whether the flash can mark things obsolete is a flash feature. At
> > least this is based on a flash feature. So maybe we should rename it
> > to MTD_CAN_FLIP_SINGLE_BITS or similar and then have
>
> This is better then "can mark obsolete". At lease this does not depend
> on application. But I still think that the number of features like is
> unpredictable and large. If the number of bits in mtd->flags is not
> enough, will you introduce mtd->flags1? This is not very generic
> approach.
I believe the number of sane features is not very high. Right now,
jffs2 is de-facto the only thing to worry about. And last time I
checked (two weeks ago), there were not many different decisions based
on flash type.
It was just notable that decisions were based on (type==THIS ||
(type==that && flags==bla) || (type==whatnot). And I'd like to get
rid of such code.
If you send a new filesystem to be merged and that one requires 52
sane new features, you may be right.
Jörn
--
The grand essentials of happiness are: something to do, something to
love, and something to hope for.
-- Allan K. Chalmers
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:20 ` Jörn Engel
@ 2006-03-31 12:28 ` Artem B. Bityutskiy
2006-03-31 12:57 ` Jörn Engel
2006-03-31 17:22 ` Nicolas Pitre
0 siblings, 2 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 12:28 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
Jörn Engel wrote:
> I believe the number of sane features is not very high. Right now,
> jffs2 is de-facto the only thing to worry about. And last time I
> checked (two weeks ago), there were not many different decisions based
> on flash type.
>
Whatever is the de-facto user, I orient to common principles. One of
them is modularization. JFFS2 and MTD are a distinc and separate thngs.
And having anything JFFS2-specific in MTD is insane in my humble
oppinion, sorry. And this does not depend on what is de-facto.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:28 ` Artem B. Bityutskiy
@ 2006-03-31 12:57 ` Jörn Engel
2006-03-31 13:08 ` Artem B. Bityutskiy
2006-03-31 17:22 ` Nicolas Pitre
1 sibling, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-03-31 12:57 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 March 2006 16:28:45 +0400, Artem B. Bityutskiy wrote:
>
> Whatever is the de-facto user, I orient to common principles. One of
> them is modularization. JFFS2 and MTD are a distinc and separate thngs.
> And having anything JFFS2-specific in MTD is insane in my humble
> oppinion, sorry. And this does not depend on what is de-facto.
Full agreement.
Only difference is that I would grind my teeth and accept a
jffs2-specific flag in mtd, if the overall code is improved that way.
Adding 10 lines of cruft to mtd and removing 50 lines of cruft from
jffs2 would be a net win.
If someone smarter comes along and notices that the 10 lines of cruft
can be removed from mtd without reintroducing 50 lines to jffs2, even
better. Just like you showed me to use "can flip bits" instead of
"can mark obsolete".
Jörn
--
Mac is for working,
Linux is for Networking,
Windows is for Solitaire!
-- stolen from dc
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:57 ` Jörn Engel
@ 2006-03-31 13:08 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-03-31 13:08 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 14:57 +0200, Jörn Engel wrote:
> Full agreement.
>
> Only difference is that I would grind my teeth and accept a
> jffs2-specific flag in mtd, if the overall code is improved that way.
> Adding 10 lines of cruft to mtd and removing 50 lines of cruft from
> jffs2 would be a net win.
>
> If someone smarter comes along and notices that the 10 lines of cruft
> can be removed from mtd without reintroducing 50 lines to jffs2, even
> better. Just like you showed me to use "can flip bits" instead of
> "can mark obsolete".
Well, features could be implemented as a separate unit in MTD. This
unit could do the following:
1. define set of features.
2. test if a particular feature is available in a particular MTD devise.
So we would just have:
#include <mtd/feature.h>
#define jffs2_can_mark_obsolete(c) \
mtd_feature_available(c->mtd, MTD_FEATURE_CHANGE_BIT)
The "feature" module would make the right decision basing on mtd->type.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:02 ` Artem B. Bityutskiy
2006-03-31 8:05 ` Artem B. Bityutskiy
@ 2006-03-31 16:49 ` Nicolas Pitre
2006-04-02 10:51 ` Artem B. Bityutskiy
2006-04-03 4:06 ` Vitaly Wool
1 sibling, 2 replies; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 16:49 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 Mar 2006, Artem B. Bityutskiy wrote:
> Alexander Belyakov wrote:
> > > Well, probably this is a perversion and is not needed in reality, but
> > > still. I conceive it like this. Yo have 2 flashes. You as usually,
> > > calculate the resulting eraseblock size. You see at the minimal I/O unit
> > > size of both flashes and similarly calculate the resulting minimal I/O
> > > size. So that's it. You'll end up with a though perverted, but still a
> > > striped MTD device.
> >
> > First problem in case of striping NOR and NAND is a question about type of
> > striped device. Should we report it as NOR or as NAND. I believe it is
> > important for clients to know about that. Imagine, for example, device
> > reported as NAND behaves as NOR or vice versa. Another problem is a
> > difference in operation speed. Apparently you won't get any performance
> > gain. These are only top of iceberg. Note that even plain and simple
> > mtdconcat is not supposed to work with flashes of different types.
>
> Good question. I think you could report this is a striped device (introducing
> an MTD_STRIPED option). Also you may provide a stripe_get_info(struct mtd_info
> *mtd) function which will return a struct stripe_info object describing this
> striped device, including the components it consists of.
But... before going that far, is this something that really makes sense
in practice?
IMHO striping NOR and NAND together simply makes no sense. NOR and NAND
are fundamentally different things when it comes to writing to them, and
apart from evaluation boards where every possible peripheral can be
found you rarely will find both NOR and NAND in the same real life
design.
So why bother?
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 8:55 ` Artem B. Bityutskiy
@ 2006-03-31 16:59 ` Nicolas Pitre
2006-04-02 11:22 ` Artem B. Bityutskiy
0 siblings, 1 reply; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 16:59 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 31 Mar 2006, Artem B. Bityutskiy wrote:
> On Fri, 2006-03-31 at 12:17 +0400, Alexander Belyakov wrote:
> > In that case clients should be aware of using striped devices and
> > provide special support for them. The one of the ideas of the suggested
> > solution is to hide striping internals from the client providing generic
> > mtd device (just with somewhat increased performance).
>
> I think you are one more guy who hit on the not generic enough
> interface.
>
> Indeed, on the one hand, you want to inform the type of flash used. If
> this is NAND, people have to use NAND-specific stuff like
> mtd->write_ecc. If this is NOR, they have to use NOR-related stuff like
> mtd->write, mtd->point and so on.
>
> On the other hand, you still want to inform users that this is a striped
> MTD device. They may want to know this.
>
> And the best thing you can do is not to adopt yourself to far too old
> MTD interface, which is like this for historical reasons, but to make a
> revolution in the MTD interface itself.
>
> By revolution I mean:
>
> 1. To invent a common generic flash model.
> 2. To make MTD interface generic.
> 3. To fix existing users. Well, I would say fixing JFFS2 may be enough.
>
> This is a big piece of work, but it is what I consider a professional
> approach.
I'm sorry but I must disagree with your assertion above.
Yes the MTD interface might need a big revamp... but why would that be
related to stripe at all?
Why would a striped MTD device be different from, say, a partitioned MTD
device?
Part of having a professional approach is _not_ to mix everything up
together. And to that effect I don't see anything at the interface
level even in its current form that would prevent an MTD stripe module
from playing nicely with the rest of the users, just like mtd partitions
or mtdconcat.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 10:07 ` Jörn Engel
2006-03-31 10:18 ` Artem B. Bityutskiy
@ 2006-03-31 17:06 ` Nicolas Pitre
1 sibling, 0 replies; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 17:06 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1231 bytes --]
On Fri, 31 Mar 2006, Jörn Engel wrote:
> On Fri, 31 March 2006 14:00:09 +0400, Artem B. Bityutskiy wrote:
> > On Fri, 2006-03-31 at 11:40 +0200, Jörn Engel wrote:
> > > On Fri, 31 March 2006 13:36:43 +0400, Artem B. Bityutskiy wrote:
> > > >
> > > > No, mtd->type has to tell you the type of the MTD device. Ideally, this
> > > > has to me the only flash-specific field in the mtd_info structure. And
> > > > if users want to do some flash specific things, they have to look at
> > > > mtd->type, realize what is the subsystem which handles this flash, and
> > > > start working with this subsystem. For striping, this is the striping
> > > > subsystem. I don't know what for mtd->flags, probably this hast to go at
> > > > all.
> > >
> > > Is this exported to userspace via mtdchar?
> > >
> > I don't quite understand what is "this".
> >
> > Ideally yes...
>
> I take it that mtd->type is not exported to userspace yet. Which is
> good, because imho it shouldn't be. Flash should me a step towards a
> standard interface, something that hard disk have had for years.
> Knowing every little detail about every single flash chip bears more
> harm than good.
And for the record I'm completely with you here.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 11:55 ` Artem B. Bityutskiy
2006-03-31 11:59 ` Jörn Engel
@ 2006-03-31 17:14 ` Nicolas Pitre
2006-04-02 12:11 ` Artem B. Bityutskiy
1 sibling, 1 reply; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 17:14 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1661 bytes --]
On Fri, 31 Mar 2006, Artem B. Bityutskiy wrote:
> On Fri, 2006-03-31 at 13:40 +0200, Jörn Engel wrote:
> > Right now, we have code like
> >
> > #define jffs2_can_mark_obsolete(c) ((c->mtd->type == MTD_NORFLASH && !(c->mtd->flags & MTD_ECC)) || c->mtd->type == MTD_RAM)
> >
> > I guess we can both agree that it is far from optimal. And my take
> > would be to replace it with
> >
> > #define jffs2_can_mark_obsolete(c) (c->mtd->flags & MTD_CAN_MARK_OBSOLETE)
> >
> Add an c->flags field to the per-JFFS2 structure. Initialize it on mount
> properly. Then
>
> #define jffs2_can_mark_obsolete(c) (c->flags & JFFS2_CAN_MARK_OBSOLETE)
>
> Don't try to jam this to MTD please.
Artem, you have it backward.
What needs to be exported is a flash _capability_, not a flash type.
We don't care a single whit if the flash is NOR or NAND or bufferedNOR
or any other bastardized type that will come up in the future.
What MTD users like JFFS2 or mtdchar needs to know is:
- what is the minimum write size
- what is the optimal write size
- can individual bits be cleared
- can individual bits be set
- what the OOB size is
etc. That is sensible information to have and that what the actual code
implementation in JFFS2 cares about.
Otherwise, what do you do with, say, Sibley flash? Sibley is NOR but
not like the traditional definition of NOR flash. It could be seen like
buffered NOR but not exactly. It could be handled like NAND but without
any OOB data. So what is the solution: adding yet one more flash type?
Alternately it could be described fully in terms of the above
_capabilities_ which is what matters in the end.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:11 ` Artem B. Bityutskiy
2006-03-31 12:20 ` Jörn Engel
@ 2006-03-31 17:19 ` Nicolas Pitre
2006-04-02 12:34 ` Artem B. Bityutskiy
1 sibling, 1 reply; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 17:19 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
[-- Attachment #1: Type: TEXT/PLAIN, Size: 905 bytes --]
On Fri, 31 Mar 2006, Artem B. Bityutskiy wrote:
> On Fri, 2006-03-31 at 13:59 +0200, Jörn Engel wrote:
> > Whether the flash can mark things obsolete is a flash feature. At
> > least this is based on a flash feature. So maybe we should rename it
> > to MTD_CAN_FLIP_SINGLE_BITS or similar and then have
>
> This is better then "can mark obsolete". At lease this does not depend
> on application. But I still think that the number of features like is
> unpredictable and large.
No. The flash type is more unpredictable than the number of feature
which is what flash types are made of anyway. If you create a new flash
type, it is more likely to be a different combination of existing
features. And if the number of features grows, then it is very likely
that you'll have to add code in JFFS2 to cope with it anyway.
So in the end the flash type is redundent and purely informational.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 12:28 ` Artem B. Bityutskiy
2006-03-31 12:57 ` Jörn Engel
@ 2006-03-31 17:22 ` Nicolas Pitre
2006-04-03 13:06 ` Jörn Engel
1 sibling, 1 reply; 65+ messages in thread
From: Nicolas Pitre @ 2006-03-31 17:22 UTC (permalink / raw)
To: Artem B. Bityutskiy
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
[-- Attachment #1: Type: TEXT/PLAIN, Size: 826 bytes --]
On Fri, 31 Mar 2006, Artem B. Bityutskiy wrote:
> Jörn Engel wrote:
> > I believe the number of sane features is not very high. Right now,
> > jffs2 is de-facto the only thing to worry about. And last time I
> > checked (two weeks ago), there were not many different decisions based
> > on flash type.
> >
>
> Whatever is the de-facto user, I orient to common principles. One of them is
> modularization. JFFS2 and MTD are a distinc and separate thngs. And having
> anything JFFS2-specific in MTD is insane in my humble oppinion, sorry. And
> this does not depend on what is de-facto.
You are right here. This is why MTD drivers should export _flash_
features rather than filesystem requirements. And it is up to
filesystem code to cope with the (lack of) certain flash features not
the other way around.
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 16:49 ` Nicolas Pitre
@ 2006-04-02 10:51 ` Artem B. Bityutskiy
2006-04-03 4:06 ` Vitaly Wool
1 sibling, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-04-02 10:51 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
On Fri, 2006-03-31 at 11:49 -0500, Nicolas Pitre wrote:
> But... before going that far, is this something that really makes sense
> in practice?
> IMHO striping NOR and NAND together simply makes no sense. NOR and NAND
> are fundamentally different things when it comes to writing to them, and
> apart from evaluation boards where every possible peripheral can be
> found you rarely will find both NOR and NAND in the same real life
> design.
>
> So why bother?
Err, I did not offer to do this, right? I should have used the
conjunctive mood (would, could be, etc) for that, sorry.
I don't at all think it is sane, I just wanted to say that there is no
fundamental reasons why this cannot work. The other question that this
seems to be a perversion, but it still can work, in theory. And if all
is done correctly, I should be able to stripe NAND and NOR, in theory.
But again, I emphasize, I do not offer to try this combination, or to
test it.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 16:59 ` Nicolas Pitre
@ 2006-04-02 11:22 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-04-02 11:22 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Korolev, Alexey, Vitaly Wool,
Kutergin, Timofey, linux-mtd
> I'm sorry but I must disagree with your assertion above.
NP :-), I just expressed my opinion, it of course does not have to be
correct!
> Yes the MTD interface might need a big revamp... but why would that be
> related to stripe at all?
Note, I did not say: guys, I think you definitely must do that. I quote
myself: "And the best thing you can do is"... Well, probably I should
have said "And the best thing you *could do* is"? Please, remember, that
I don't have an excellent feeling of English language (yet, I hope :-))
so some things may sound tough while I do not mean this. Apologies.
I emphasize, I'm saying about how I think it would be better to do this.
If our Intel partners like my Ideas, they may make use of them. If they
don't, they surely don't have to. They may argue and show me that what
I'm talking about is insane. I understand that they may have no time to
re-work MTD.
> Why would a striped MTD device be different from, say, a partitioned MTD
> device?
I opposite, I think they must look the same way for dumb applications.
But smart applications, which may want to use some stripe-specific
things, still have to have a way to figure out that this MTD device is a
striping device. I hope you don't object here. And I would extend this
statement:
-- the MTD interface should be completely uniform, but there has to be a
uniform way to figure out the type of the underlying MTD device. Using
this type applications may start using some device-specific things, but
not via mtd_info, but using the corresponding MTD subsystem (NAND
subsystem, stripe subsystem, DataFlash subsystem, etc).
Please, look at my logical chain. Once there should be a way to
recognize the type of MTD devices, the right way to do this is to look
at the mtd->type field. And hence, it should be MTD_STRIPE.
Question: well, fine, but how will we point whether it is NOR or NAND?
Answer: Indeed, this is a problem. I think "the right" way is to get rid
of the need to realize whether this is NOR or NAND at all. To do this,
we have to to make the MTD interface more generic.
I buy your grounded criticism. But I do not see an alternative offer how
this should look from you. I assume you mean that no MTD_STRIPE stripe
should be introduced, and that mtd->type have to contain MTD_NOR,
MTD_NAND, or whatever the striped flash is.
Well, this will work. This is easier. I do not object. But this is not
perfect in my opinion.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 17:14 ` Nicolas Pitre
@ 2006-04-02 12:11 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-04-02 12:11 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
On Fri, 2006-03-31 at 12:14 -0500, Nicolas Pitre wrote:
> What needs to be exported is a flash _capability_, not a flash type.
> We don't care a single whit if the flash is NOR or NAND or bufferedNOR
> or any other bastardized type that will come up in the future.
(/me kindly reminds that this is a sort of "theoretical" conversation,
the current reality is different. And this is a bit out of striping
context even.)
My statement is: applications have the right to know (if they do really
want) which type of device (a striped thing, NAND, NOR, etc) the MTD
device represents. So, mtd_info has to provide this information anyway.
And mtd->type is an excellent field for this.
So, "We don't care a single whit if the flash is NOR or NAND or ...",
unless we really want to, right?
> What MTD users like JFFS2 or mtdchar needs to know is:
>
> - what is the minimum write size
>
> - what is the optimal write size
>
> - can individual bits be cleared
>
> - can individual bits be set
This is what I call "the MTD device model". We define a set of
operations and metrics, and then represent any flash using this model.
We don't care that we loose some particular/peculiar features of some
flash types. We are rough. But we are generic. And there is still a way
to make use of the peculiarities if there is such a need,
> - what the OOB size is
---- as mall off-topic note ----
One note about this. There is a trend to remove OOB support from MTD
info at all. Yo may want to read this thread:
http://lists.infradead.org/pipermail/linux-mtd/2006-February/014843.html
--------------------------------
>
> etc. That is sensible information to have and that what the actual code
> implementation in JFFS2 cares about.
Indeed, this is sensible information. Yes, this is very useful.
Let's think, how many features there are going to be. I think many. How
to technically add support of these features? Flags? - bad, as in case
of many features you'll have to have flags1, flags2 fields.
I emphasize, now I do not object against the features stuff. I offer a
way to technically implement them.
I still think the only device-dependent field in MTD has to be mtd->type
storing the type of flash. I offer to get rig of any "feature" flag, at
least because it is not clear how to maintain *potentially* very large
number of features.
But we still may implement a small distinct unit in MTD. This module
will return a set of features present for each particular MTD device. It
will use mtd->type for this. This is semantically the same, but
technically nicer.
Please, look here:
http://lists.infradead.org/pipermail/linux-mtd/2006-March/015200.html
> Otherwise, what do you do with, say, Sibley flash? Sibley is NOR but
> not like the traditional definition of NOR flash. It could be seen like
> buffered NOR but not exactly. It could be handled like NAND but without
> any OOB data. So what is the solution: adding yet one more flash type?
> Alternately it could be described fully in terms of the above
> _capabilities_ which is what matters in the end.
I offer to have mtd->type == MTD_SIBLEY_NOR. And add the knowledge about
Sibley to the "MTD feature unit".
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 17:19 ` Nicolas Pitre
@ 2006-04-02 12:34 ` Artem B. Bityutskiy
0 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-04-02 12:34 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd
On Fri, 2006-03-31 at 12:19 -0500, Nicolas Pitre wrote:
> So in the end the flash type is redundent and purely informational.
Imagine you're writing a cool flash file system. You want to optimize it
for Sibley NOR. According to my views, you may want to look at
mtd->type, figure out you are working with Sibley NOR, then start
directly talking to the Sybley support subsystem, and realize that
"buffer size". I mean not the minimal I/O size, but that "optimal" I/O
size you mentioned before. So, your file system may be cool and try to
do Input/Out in a more optimal way.
So, for this cool flash file system flash type is an important field.
Of course, you may also use the feature unit I offered. You may, do
things like
if (mtd_feature_available(mtd, MTD_FEATURE_SINGLE_BIT_IO))
etc.
My picture looks very shiny for me :-)
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 16:49 ` Nicolas Pitre
2006-04-02 10:51 ` Artem B. Bityutskiy
@ 2006-04-03 4:06 ` Vitaly Wool
2006-04-03 6:04 ` Thomas Gleixner
2006-04-03 13:44 ` Nicolas Pitre
1 sibling, 2 replies; 65+ messages in thread
From: Vitaly Wool @ 2006-04-03 4:06 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Artem B. Bityutskiy, Alexander Belyakov, linux-mtd,
Kutergin, Timofey, Korolev, Alexey
Nicolas Pitre wrote:
> IMHO striping NOR and NAND together simply makes no sense. NOR and NAND
> are fundamentally different things when it comes to writing to them, and
> apart from evaluation boards where every possible peripheral can be
> found you rarely will find both NOR and NAND in the same real life
> design.
>
Oh really? What about MP3 player-oriented design with NAND flash as a
main storage and NOR flash for kernel/userspace XIP etc?
Vitaly
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 4:06 ` Vitaly Wool
@ 2006-04-03 6:04 ` Thomas Gleixner
2006-04-03 6:14 ` Vitaly Wool
` (2 more replies)
2006-04-03 13:44 ` Nicolas Pitre
1 sibling, 3 replies; 65+ messages in thread
From: Thomas Gleixner @ 2006-04-03 6:04 UTC (permalink / raw)
To: Vitaly Wool
Cc: Alexander Belyakov, Kutergin, Timofey, Korolev, Alexey, linux-mtd,
Artem B. Bityutskiy, Nicolas Pitre
On Mon, 2006-04-03 at 08:06 +0400, Vitaly Wool wrote:
> Nicolas Pitre wrote:
> > IMHO striping NOR and NAND together simply makes no sense. NOR and NAND
> > are fundamentally different things when it comes to writing to them, and
> > apart from evaluation boards where every possible peripheral can be
> > found you rarely will find both NOR and NAND in the same real life
> > design.
> >
> Oh really? What about MP3 player-oriented design with NAND flash as a
> main storage and NOR flash for kernel/userspace XIP etc?
Granted, but you can not mix the usage of those chips.
Functions like concat or striping can only be used with FLASH of the
same type. NAND and NOR are so fundamentally different it wont work
without some ugly hack around. There is no point to even think about
that.
Also striping on NAND is a seperate topic. Most new hardware designs
have NAND controllers included which provide e.g. hardware based ECC.
Most of the controllers I'm aware of are not really suitable for
striping due to their design. Also striping would require a fundamental
change to the NAND code, as it currently serializes the access to shared
hardware controllers. This seralization needs to be carefully redesigned
to allow striping and even then it depends on the controller and the
overall hardware design (most designs have OR-wired ready/busy pins)
whether it's possible or not.
I have not looked at the patch closely - and I will not until it is in
an acceptable form - but I have the feeling that the striping support
needs more than a bunch of hacks to the core mtd chip support if we do
not want to end up with a complete unmaintainable mess.
tglx
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 6:04 ` Thomas Gleixner
@ 2006-04-03 6:14 ` Vitaly Wool
2006-04-03 6:21 ` Thomas Gleixner
2006-04-03 6:59 ` Artem B. Bityutskiy
2006-04-03 7:20 ` Alexander Belyakov
2 siblings, 1 reply; 65+ messages in thread
From: Vitaly Wool @ 2006-04-03 6:14 UTC (permalink / raw)
To: tglx
Cc: Alexander Belyakov, Kutergin, Timofey, Korolev, Alexey, linux-mtd,
Artem B. Bityutskiy, Nicolas Pitre
Thomas Gleixner wrote:
> Functions like concat or striping can only be used with FLASH of the
> same type. NAND and NOR are so fundamentally different it wont work
> without some ugly hack around. There is no point to even think about
> that.
>
I'm afraid I can't object against that ;)
> Also striping on NAND is a seperate topic. Most new hardware designs
> have NAND controllers included which provide e.g. hardware based ECC.
> Most of the controllers I'm aware of are not really suitable for
> striping due to their design. Also striping would require a fundamental
> change to the NAND code, as it currently serializes the access to shared
> hardware controllers. This seralization needs to be carefully redesigned
> to allow striping and even then it depends on the controller and the
> overall hardware design (most designs have OR-wired ready/busy pins)
> whether it's possible or not.
>
>
Given that some modern NAND controllers have the ability to generate
interrupt when they're done, I would think about complete redesign of
the MTD NAND layer. I'd like to see the fully asynchronous base model
here (i. e. mtd->send_write_cmd/send_read_cmd or something similar) and
synchronous interface on top of that, just like, say, the current SPI
core works.
This would allow to be more flexible in waiting for completion and also
would IMO make striping implementation for NAND more straightforward.
Does that make sense?
Vitaly
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 6:14 ` Vitaly Wool
@ 2006-04-03 6:21 ` Thomas Gleixner
0 siblings, 0 replies; 65+ messages in thread
From: Thomas Gleixner @ 2006-04-03 6:21 UTC (permalink / raw)
To: Vitaly Wool
Cc: Alexander Belyakov, Kutergin, Timofey, Korolev, Alexey, linux-mtd,
Artem B. Bityutskiy, Nicolas Pitre
On Mon, 2006-04-03 at 10:14 +0400, Vitaly Wool wrote:
> Given that some modern NAND controllers have the ability to generate
> interrupt when they're done, I would think about complete redesign of
> the MTD NAND layer. I'd like to see the fully asynchronous base model
> here (i. e. mtd->send_write_cmd/send_read_cmd or something similar) and
> synchronous interface on top of that, just like, say, the current SPI
> core works.
> This would allow to be more flexible in waiting for completion and also
> would IMO make striping implementation for NAND more straightforward.
> Does that make sense?
In general yes, but it does not solve the problem, where you have _ONE_
shared line for ready/busy -> interrupt for all chips connected to the
hardware controller, nor does it solve the general serialization
requirements to access the controller.
tglx
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 6:04 ` Thomas Gleixner
2006-04-03 6:14 ` Vitaly Wool
@ 2006-04-03 6:59 ` Artem B. Bityutskiy
2006-04-03 7:20 ` Alexander Belyakov
2 siblings, 0 replies; 65+ messages in thread
From: Artem B. Bityutskiy @ 2006-04-03 6:59 UTC (permalink / raw)
To: tglx
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Nicolas Pitre
Thomas Gleixner wrote:
> Functions like concat or striping can only be used with FLASH of the
> same type. NAND and NOR are so fundamentally different it wont work
> without some ugly hack around. There is no point to even think about
> that.
Striping different flash type is an utterly insane perversion indeed.
But still, I imagine 2 MTD devices, one NAND, one NOR. Both can read and
write. We select the correct eraseblock size and the minimal I/O unit of
the resulting striped flash (probably much space will be wasted). We
direct writes to different threads in correspondence to the interleave
size. And this ugly monster should work in theory! :-)
So, conversely, this ugliness will work if all is done in a generic
manner and there are *no* ugly hacks! Funny :-) But anyway, let's drop
the subject.
--
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 6:04 ` Thomas Gleixner
2006-04-03 6:14 ` Vitaly Wool
2006-04-03 6:59 ` Artem B. Bityutskiy
@ 2006-04-03 7:20 ` Alexander Belyakov
2 siblings, 0 replies; 65+ messages in thread
From: Alexander Belyakov @ 2006-04-03 7:20 UTC (permalink / raw)
To: tglx
Cc: Vitaly Wool, Kutergin, Timofey, Korolev, Alexey, linux-mtd,
Artem B. Bityutskiy, Nicolas Pitre
Thomas Gleixner wrote:
>> Oh really? What about MP3 player-oriented design with NAND flash as a
>> main storage and NOR flash for kernel/userspace XIP etc?
>
> Granted, but you can not mix the usage of those chips.
>
> Functions like concat or striping can only be used with FLASH of the
> same type. NAND and NOR are so fundamentally different it wont work
> without some ugly hack around. There is no point to even think about
> that.
That's my point too. Moreover there is no reason to stripe NOR and NAND.
Even if you were able to make "some ugly hacks" to configure it you wont
get any performance increase due to significantly different speed of
those chips. Striped device will work with the speed of the slowest
sub-device.
> I have the feeling that the striping support
> needs more than a bunch of hacks to the core mtd chip support if we do
> not want to end up with a complete unmaintainable mess.
Originally striping layer has been developed for NOR and Sibley flashes
which quite slow. Striping for NOR is quite simple despite it requires
some changes in command sets implementation.
If you look at my patches you find nothing that changes MTD NAND
subsystem. Only interleaving algorithm (virtual pages merging) and
worker threads queues have been implemented for NAND flashes. It was not
possible to check striped NAND performance gain due to some hardware
limitations we have - but I believe NAND striping will come across the
same problems with thread switching as we had for NOR devices. Anyway I
feel that in general NAND striping wont be so simple as NOR striping.
Thanks,
Alexander
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-03-31 17:22 ` Nicolas Pitre
@ 2006-04-03 13:06 ` Jörn Engel
2006-04-03 13:18 ` Jörn Engel
2006-04-04 1:41 ` Josh Boyer
0 siblings, 2 replies; 65+ messages in thread
From: Jörn Engel @ 2006-04-03 13:06 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Artem B. Bityutskiy
On Fri, 31 March 2006 12:22:50 -0500, Nicolas Pitre wrote:
>
> You are right here. This is why MTD drivers should export _flash_
> features rather than filesystem requirements. And it is up to
> filesystem code to cope with the (lack of) certain flash features not
> the other way around.
As long as noone is being a fundamentalist zealot, we all agree here.
My fine point of disagreement is that I stress "should" above
everything else. Quite a few things can and should be improved about
mtd and its users. If everything were perfect, it would all be about
flash features. Until then, it may make sense to _temporarily_ hold
some filesystem features. While this is undeniably a bad thing in its
own, it can allow you to get rid of bigger warts.
Once the bigger warts are gone, we can aim for perfect. ;)
As a step in that direction, please take a look at this patchset:
http://wh.fh-wedel.de/~joern/mtd_type.tgzwh.fh-wedel.de/~joern/mtd_type.tgz
It removes all types except MTD_ABSENT and all flags but MTD_OOB. As
a replacement, three new flags are introduced. So now we're at a
total of 4 flags (previously 9) and two types (previously 9).
Jörn
--
...one more straw can't possibly matter...
-- Kirby Bakken
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 13:06 ` Jörn Engel
@ 2006-04-03 13:18 ` Jörn Engel
2006-04-04 1:39 ` Josh Boyer
2006-04-04 1:41 ` Josh Boyer
1 sibling, 1 reply; 65+ messages in thread
From: Jörn Engel @ 2006-04-03 13:18 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Artem B. Bityutskiy
On Mon, 3 April 2006 15:06:19 +0200, Jörn Engel wrote:
>
> As a step in that direction, please take a look at this patchset:
> http://wh.fh-wedel.de/~joern/mtd_type.tgzwh.fh-wedel.de/~joern/mtd_type.tgz
Taking a closer look at the code, reader should notice three
interesting things, btw:
1. Nico is a lazy moron,
2. Artem is a lazy moron and
3. Jörn is a lazy moron.
Nico, you didn't notice that the Sibley flashes are just another
iteration of ST's ECC NOR flashes. Special code for both flash types
is nearly identical, except for the field used to store the writesize
in struct mtd_info.
Artem, you didn't notice that Dataflash has the exact same cleanup
routine as ECC NOR and NAND flash has. And for the initialization
code, that could be shared as well.
Jörn, I didn't notice that ECC NOR has the exact same cleanup routine
as NAND flash. And for the initialization code, that could be shared
as well.
Overall, JFFS2 currently has four sets of special setup/cleanup code
for NAND, ECC NOR, Dataflash and Sibley. Simply because the three of
us were too lazy to read the existing code and didn't notice. Noone
else seemed to have noticed either, so the "lazy moron" thing seems to
be fairly universal. Don't take it as an insult, but rather as a
statement about human nature (and a trick to attract attention).
Nico, can you take a close look at the Sibley part of the patches and
checker whether I broke anything? Combining ECC NOR and Sibley was
the low hanging fruit, so I already picked it. Combining the three
remaining sets is harder. Maybe someone else is feeling the urge to
combine those.
Jörn
--
He who knows that enough is enough will always have enough.
-- Lao Tsu
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 4:06 ` Vitaly Wool
2006-04-03 6:04 ` Thomas Gleixner
@ 2006-04-03 13:44 ` Nicolas Pitre
1 sibling, 0 replies; 65+ messages in thread
From: Nicolas Pitre @ 2006-04-03 13:44 UTC (permalink / raw)
To: Vitaly Wool
Cc: Artem B. Bityutskiy, Alexander Belyakov, linux-mtd,
Kutergin, Timofey, Korolev, Alexey
On Mon, 3 Apr 2006, Vitaly Wool wrote:
> Nicolas Pitre wrote:
> > IMHO striping NOR and NAND together simply makes no sense. NOR and NAND are
> > fundamentally different things when it comes to writing to them, and apart
> > from evaluation boards where every possible peripheral can be found you
> > rarely will find both NOR and NAND in the same real life design.
> >
> Oh really? What about MP3 player-oriented design with NAND flash as a main
> storage and NOR flash for kernel/userspace XIP etc?
You just demonstrated above why you'd never stripe those. ;-)
Nicolas
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 13:18 ` Jörn Engel
@ 2006-04-04 1:39 ` Josh Boyer
0 siblings, 0 replies; 65+ messages in thread
From: Josh Boyer @ 2006-04-04 1:39 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Artem B. Bityutskiy, Nicolas Pitre
On 4/3/06, Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
> On Mon, 3 April 2006 15:06:19 +0200, Jörn Engel wrote:
> >
> > As a step in that direction, please take a look at this patchset:
> > http://wh.fh-wedel.de/~joern/mtd_type.tgzwh.fh-wedel.de/~joern/mtd_type.tgz
>
> Taking a closer look at the code, reader should notice three
> interesting things, btw:
>
> 1. Nico is a lazy moron,
> 2. Artem is a lazy moron and
> 3. Jörn is a lazy moron.
There should also be a Josh is a lazy moron. I merged the NOR ECC
stuff in JFFS2.
> Nico, you didn't notice that the Sibley flashes are just another
> iteration of ST's ECC NOR flashes. Special code for both flash types
> is nearly identical, except for the field used to store the writesize
> in struct mtd_info.
I've been thinking about that for a while now. Real life has
intervened for the most part.
>
> Jörn, I didn't notice that ECC NOR has the exact same cleanup routine
> as NAND flash. And for the initialization code, that could be shared
> as well.
And a large majority of the ECC NOR stuff could have just been merged
into cfi_cmdset_0001.c instead of writing a whole new file. Indeed,
it really has to be because the larger 16MiB chips _report_ a command
set of 0001 even though they aren't. I have code that does all this,
but it's based on old MTD and wants sanitizing.
> Nico, can you take a close look at the Sibley part of the patches and
> checker whether I broke anything? Combining ECC NOR and Sibley was
> the low hanging fruit, so I already picked it. Combining the three
> remaining sets is harder. Maybe someone else is feeling the urge to
> combine those.
See above :). I'm off traveling for work at the moment, but I'll try
to look at the patchset at some point next week.
josh
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH/RFC] MTD: Striping layer core
2006-04-03 13:06 ` Jörn Engel
2006-04-03 13:18 ` Jörn Engel
@ 2006-04-04 1:41 ` Josh Boyer
1 sibling, 0 replies; 65+ messages in thread
From: Josh Boyer @ 2006-04-04 1:41 UTC (permalink / raw)
To: Jörn Engel
Cc: Alexander Belyakov, Vitaly Wool, Kutergin, Timofey,
Korolev, Alexey, linux-mtd, Artem B. Bityutskiy, Nicolas Pitre
On 4/3/06, Jörn Engel <joern@wohnheim.fh-wedel.de> wrote:
>
> As a step in that direction, please take a look at this patchset:
> http://wh.fh-wedel.de/~joern/mtd_type.tgzwh.fh-wedel.de/~joern/mtd_type.tgz
>
> It removes all types except MTD_ABSENT and all flags but MTD_OOB. As
> a replacement, three new flags are introduced. So now we're at a
> total of 4 flags (previously 9) and two types (previously 9).
Can we pull this out into a different thread. I almost lost it
because I have long sinced stopped caring about the stripping subject
until others come to some form of consensus :).
josh
^ permalink raw reply [flat|nested] 65+ messages in thread
end of thread, other threads:[~2006-04-04 1:41 UTC | newest]
Thread overview: 65+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-30 7:57 [PATCH/RFC] MTD: Striping layer core Belyakov, Alexander
2006-03-30 9:06 ` Vitaly Wool
2006-03-30 11:50 ` Artem B. Bityutskiy
2006-03-30 12:15 ` Vitaly Wool
2006-03-30 15:24 ` Alexander Belyakov
2006-03-30 15:39 ` Artem B. Bityutskiy
2006-03-31 7:06 ` Alexander Belyakov
2006-03-31 8:02 ` Artem B. Bityutskiy
2006-03-31 8:05 ` Artem B. Bityutskiy
2006-03-31 8:17 ` Alexander Belyakov
2006-03-31 8:38 ` Artem B. Bityutskiy
2006-03-31 8:55 ` Artem B. Bityutskiy
2006-03-31 16:59 ` Nicolas Pitre
2006-04-02 11:22 ` Artem B. Bityutskiy
2006-03-31 9:27 ` Jörn Engel
2006-03-31 9:36 ` Artem B. Bityutskiy
2006-03-31 9:40 ` Jörn Engel
2006-03-31 10:00 ` Artem B. Bityutskiy
2006-03-31 10:06 ` Artem B. Bityutskiy
2006-03-31 10:07 ` Jörn Engel
2006-03-31 10:18 ` Artem B. Bityutskiy
2006-03-31 11:40 ` Jörn Engel
2006-03-31 11:47 ` Artem B. Bityutskiy
2006-03-31 11:56 ` Jörn Engel
2006-03-31 12:06 ` Artem B. Bityutskiy
2006-03-31 11:55 ` Artem B. Bityutskiy
2006-03-31 11:59 ` Jörn Engel
2006-03-31 12:11 ` Artem B. Bityutskiy
2006-03-31 12:20 ` Jörn Engel
2006-03-31 12:28 ` Artem B. Bityutskiy
2006-03-31 12:57 ` Jörn Engel
2006-03-31 13:08 ` Artem B. Bityutskiy
2006-03-31 17:22 ` Nicolas Pitre
2006-04-03 13:06 ` Jörn Engel
2006-04-03 13:18 ` Jörn Engel
2006-04-04 1:39 ` Josh Boyer
2006-04-04 1:41 ` Josh Boyer
2006-03-31 17:19 ` Nicolas Pitre
2006-04-02 12:34 ` Artem B. Bityutskiy
2006-03-31 17:14 ` Nicolas Pitre
2006-04-02 12:11 ` Artem B. Bityutskiy
2006-03-31 17:06 ` Nicolas Pitre
2006-03-31 16:49 ` Nicolas Pitre
2006-04-02 10:51 ` Artem B. Bityutskiy
2006-04-03 4:06 ` Vitaly Wool
2006-04-03 6:04 ` Thomas Gleixner
2006-04-03 6:14 ` Vitaly Wool
2006-04-03 6:21 ` Thomas Gleixner
2006-04-03 6:59 ` Artem B. Bityutskiy
2006-04-03 7:20 ` Alexander Belyakov
2006-04-03 13:44 ` Nicolas Pitre
2006-03-30 10:35 ` Artem B. Bityutskiy
2006-03-30 15:38 ` Alexander Belyakov
2006-03-30 16:32 ` Nicolas Pitre
2006-03-30 16:38 ` Artem B. Bityutskiy
2006-03-30 16:56 ` Jared Hulbert
2006-03-30 17:03 ` Artem B. Bityutskiy
2006-03-31 7:19 ` Alexander Belyakov
2006-03-30 12:11 ` Jörn Engel
2006-03-31 6:52 ` Alexander Belyakov
2006-03-31 7:57 ` Artem B. Bityutskiy
2006-03-31 8:11 ` Alexander Belyakov
2006-03-31 8:31 ` Artem B. Bityutskiy
2006-03-31 8:35 ` Alexander Belyakov
2006-03-31 8:47 ` Jörn Engel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox