From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tao Ma
Date: Wed, 30 Apr 2008 15:18:19 +0800
Subject: [Ocfs2-devel] [patch 0/2] Add inode steal in ocfs2-1.2
In-Reply-To: <20080430070805.GA11248@dhcp-beijing-cdc-10-182-121-54.cn.oracle.com>
References: <20080430070805.GA11248@dhcp-beijing-cdc-10-182-121-54.cn.oracle.com>
Message-ID: <48181D3B.3070503@oracle.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ocfs2-devel@oss.oracle.com

One more thing: this has been tested with my script, which has been sent to
ocfs2-tools-devel for review. Please see
http://oss.oracle.com/pipermail/ocfs2-tools-devel/2008-March/000621.html

Regards,
Tao

tao.ma at oracle.com wrote:
> Hi all,
> This patch series adds an inode steal mechanism for inode allocation. It
> is backported from 2.6.26.
>
> In OCFS2, we allocate inodes from the slot-specific inode_alloc to avoid
> inode creation congestion. The local alloc file grows in large contiguous
> chunks. With a 4K block size it grows by 4M each time, so 1024 inodes
> (4M / 4K) are allocated at a time.
>
> Over time, if the fs gets fragmented enough (e.g., the user has created
> many small files and then deleted some of them), we can end up in a
> situation where we cannot extend the inode_alloc because there is no large
> free chunk in the global_bitmap, even if df shows a few gigs free. More
> annoying is that this situation invariably means that one cannot create
> inodes on one node but can on another. Still more annoying is that an
> unused slot may have space for plenty of inodes but is unusable, because
> the user may no longer be mounting that many nodes.
>
> This patch series implements a solution, which is to steal inodes from
> another slot. The whole inode allocation process now looks like this:
> 1. Allocate from our own inode_alloc:000X.
>    1) If we can reserve, OK.
>    2) If that fails, try to allocate a large chunk and reserve once again.
> 2. If 1 fails, try to allocate from ocfs2_super->inode_steal_slot.
>    This time we just try to reserve; we don't go to the global_bitmap if
>    this slot can't allocate the inode either.
> 3. If 2 fails, try the next slot, and so on, until we reach that steal
>    slot again.
>
> ocfs2_super->inode_steal_slot is initialized to the slot next to our own
> slot. Once inode stealing succeeds, we update it to the slot we stole the
> inode from. It is also reinitialized when the local truncate log or local
> alloc recovery is flushed, in which case the global bitmap may be
> refreshed.
>
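
For readers following the allocation order quoted above, here is a rough
sketch of the three steps as a toy model in Python. It is not the kernel
code: the SlotAlloc and Volume classes, their fields, and the method names
are all hypothetical and exist only for illustration. The model assumes
that each slot simply tracks a count of free inodes and that the global
bitmap is reduced to a count of large free chunks.

```python
class SlotAlloc:
    """Toy stand-in for one slot's inode_alloc (hypothetical)."""

    def __init__(self, free_inodes):
        self.free_inodes = free_inodes

    def reserve(self):
        """Take one free inode from this slot if any is left."""
        if self.free_inodes > 0:
            self.free_inodes -= 1
            return True
        return False


class Volume:
    """Toy model of the allocation order described in the mail."""

    def __init__(self, slot_free, own_slot, global_chunks, chunk_inodes=1024):
        self.slots = [SlotAlloc(n) for n in slot_free]
        self.own_slot = own_slot
        # Assumption: the global bitmap is modelled as a count of
        # large contiguous chunks that are still free.
        self.global_chunks = global_chunks
        self.chunk_inodes = chunk_inodes
        self.init_steal_slot()

    def init_steal_slot(self):
        """Point the steal slot at the slot next to our own."""
        self.inode_steal_slot = (self.own_slot + 1) % len(self.slots)

    def extend_own_alloc(self):
        """Grow our own inode_alloc by one large chunk, if one is free."""
        if self.global_chunks == 0:
            return False
        self.global_chunks -= 1
        self.slots[self.own_slot].free_inodes += self.chunk_inodes
        return True

    def alloc_inode(self):
        """Return the slot the inode came from, or None if all fail."""
        own = self.slots[self.own_slot]
        # Step 1: reserve from our own slot, extending it once if needed.
        if own.reserve():
            return self.own_slot
        if self.extend_own_alloc() and own.reserve():
            return self.own_slot
        # Steps 2 and 3: start at the steal slot and walk the slots in
        # order. Only reserve here; never touch the global bitmap.
        count = len(self.slots)
        slot = self.inode_steal_slot
        for _ in range(count):
            if slot != self.own_slot and self.slots[slot].reserve():
                self.inode_steal_slot = slot  # remember where it worked
                return slot
            slot = (slot + 1) % count
        return None


# Slot 0 is ours and is empty, the global bitmap has no large chunk left,
# and slots 2 and 3 still have free inodes.
vol = Volume(slot_free=[0, 0, 3, 5], own_slot=0, global_chunks=0)
stolen_from = vol.alloc_inode()
```

In this model the first call skips the empty slot 1 and steals from slot 2,
and later calls start at slot 2 directly because inode_steal_slot was
updated. Calling init_steal_slot() again would correspond to the
reinitialization mentioned at the end of the mail.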