* improving Reiserfs Performance
@ 2005-05-25 12:54 Jos Houtman
       [not found] ` <4294783C.5040001@darthvader.us>
  0 siblings, 1 reply; 7+ messages in thread
From: Jos Houtman @ 2005-05-25 12:54 UTC (permalink / raw)
  To: reiserfs-list

Hello list,

First of all, we are a website that provides picture albums to its users.
At the moment we host almost 2 million pictures, which are served in 5
different formats, from icons up to 700x500.
We store all these files on our NAS server and serve them through Squid
proxies.

But we are having performance problems, and I am looking into possible
areas for improvement.

Our setup is as follows:
A dual 2 GHz Xeon with 1 GB of memory.
2x 3ware 9000 cards with 8 SATA disks each.
  - Each disk set is configured as RAID 5 with one hot spare disk.
    This gives us 2x 2.2 TB partitions, of which we use about 500 GB, and
    that keeps growing.

Our directory structure is as follows:
/mnt/raid1/ORIGINALS/ (originally uploaded pictures, only used for rendering)
/mnt/raid1/RENDERED/  (resized photos)

In each of these directories we create a subdirectory per 50,000 photos:
1-50000, 50001-100000, etc.

This means that an ORIGINALS subdir contains at most 50,000 files,
but a RENDERED subdir can contain up to 250,000 (5 x 50,000) files.
We resize the pictures on demand, so the actual number varies.
The average size of the rendered files is, I think, 20 to 25 KB.
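
As an illustration of the layout described above, here is a minimal Python sketch of how a photo id could be mapped to its range subdirectory. The function names, the assumption that ids start at 1, and the `<id>.jpg` file name are hypothetical; the message only gives the two top-level directories and the 50,000-per-subdirectory rule.

```python
BUCKET_SIZE = 50_000  # photos per subdirectory, as stated in the message


def bucket_dir(photo_id, bucket_size=BUCKET_SIZE):
    """Return the range-named subdirectory (e.g. '1-50000') for a photo id.

    Assumes ids start at 1; the message does not say how ids are assigned.
    """
    if photo_id < 1:
        raise ValueError("photo ids are assumed to start at 1")
    start = ((photo_id - 1) // bucket_size) * bucket_size + 1
    end = start + bucket_size - 1
    return f"{start}-{end}"


def original_path(photo_id, root="/mnt/raid1/ORIGINALS"):
    # The '<id>.jpg' file name is a hypothetical naming scheme.
    return f"{root}/{bucket_dir(photo_id)}/{photo_id}.jpg"


print(bucket_dir(1))       # 1-50000
print(bucket_dir(50000))   # 1-50000
print(bucket_dir(50001))   # 50001-100000
```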

In peak hours about 50 to 100 files per second are requested over NFS.
What percentage of that is writes I don't know, but I would guess about
10 to 20%.
This seems to be too much for the server, and therefore I am trying to
improve things.
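
To put these numbers in perspective, here is a rough back-of-the-envelope calculation in Python using only the figures quoted above (50 to 100 requests per second, 20 to 25 KB per rendered file). Treating every request as a read of one average-sized rendered file is a simplifying assumption.

```python
def throughput_mb_per_s(requests_per_s, avg_file_kb):
    """Data rate in MB/s (1 MB = 1000 KB) for a request rate and file size."""
    return requests_per_s * avg_file_kb / 1000.0


low = throughput_mb_per_s(50, 20)
high = throughput_mb_per_s(100, 25)
print(f"{low:.1f} to {high:.1f} MB/s")  # 1.0 to 2.5 MB/s
```

At a few MB/s the raw data rate is small, so under these assumptions the number of disk seeks per request is a more likely limit than sequential bandwidth. That is an inference from the arithmetic, not something measured in this thread.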

I am anything but an expert on filesystems and disks, so maybe I have
some weird ideas,
but I am going to throw them out in the hope that some of your experience
rubs off on me.
Possible ideas I had:
- Changing to RAID 10; this should improve read performance.
- Switching to Reiser4; the benchmark on the site showed an improvement,
  but is it stable enough yet?
- Moving the reiserfs journal to another device (will this matter?)
- Changing the journal size?
- Changing the number of pictures in a subdir?
  Is there an optimal number, after which it would be better to create
  more top-level dirs?
- Caching the directory structure in memory?
- More memory so that Linux can cache more, but I don't really think this
  will help, because a wide variety of files is requested and not many
  files are requested twice.
- Changing the setup into more (smaller) disk arrays and dividing the
  files over these (but how do we provide a consistent view, like we have now?)
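
For the RAID 10 idea in the list above, the capacity trade-off can be sketched with the standard usable-space formulas. The per-disk size is not stated in the message; it is inferred here from the quoted 2.2 TB per array and should be read as an assumption.

```python
def raid5_usable(disks, disk_tb):
    """RAID 5 keeps one disk's worth of parity: usable = (n - 1) * size."""
    return (disks - 1) * disk_tb


def raid10_usable(disks, disk_tb):
    """RAID 10 mirrors pairs of disks: usable = (n // 2) * size."""
    return (disks // 2) * disk_tb


# From the message: 8 disks per card, one of them a hot spare, 2.2 TB usable.
array_disks = 8 - 1
# Assumption: per-disk size is inferred from the quoted 2.2 TB, not stated.
disk_tb = 2.2 / (array_disks - 1)

print(f"RAID 5 : {raid5_usable(array_disks, disk_tb):.1f} TB")   # 2.2 TB
print(f"RAID 10: {raid10_usable(array_disks, disk_tb):.1f} TB")  # 1.1 TB
```

Under these assumptions, keeping the hot spare leaves an odd number of disks, so a RAID 10 set per card would use six of them and offer roughly half the usable space of the current RAID 5 sets.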

Do you guys/girls have additional ideas? (Maybe our setup is totally wrong.)

I am happy to provide more details if necessary.


-Bows-

Jos Houtman






* Re: improving Reiserfs Performance
       [not found]   ` <42947BCA.9080702@hyves.nl>
@ 2005-05-25 15:55     ` Yiannis Mavroukakis
  2005-05-25 16:24       ` Jos Houtman
                         ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Yiannis Mavroukakis @ 2005-05-25 15:55 UTC (permalink / raw)
  To: Jos Houtman, reiserfs-list

Jos Houtman wrote:

> Yiannis Mavroukakis wrote:
>
>> Jos Houtman wrote:
>>
>>> Hello list,
>>>
>>> First of all, we are a website that provides picture albums to its
>>> users.
>>> At the moment we host almost 2 million pictures, which are served in
>>> 5 different formats, from icons up to 700x500.
>>> We store all these files on our NAS server and serve them through
>>> Squid proxies.
>>>
>>> But we are having performance problems, and I am looking into
>>> possible areas for improvement.
>>>
>> [snip]
>> Have you considered the fact that your bottleneck may be over NFS?
>>
>> Y.
>
>
> I most certainly did, and it is a problem, but I am trying to work my
> way from the bottom up.
> It's a bit hard to say anything about NFS performance while the
> filesystem/disks
> could be causing the majority of the delays.
>
> After I have done my best at this level I intend to look at NFS in more
> depth, but any hints/tips you already have are welcome.
>
> jos
>
>
ReiserFS actually performs best under heavy hammering and thousands
of files, IMHO ;)
Changing to RAID10 will improve performance.
Reiser4: again IMHO not yet, although your site would make a nice test
bed :)
Have you done any localised (i.e. not over a network) testing on the
filesystem? If not, you'll be stabbing in the dark for reasons, at least
in my book.
Do you have a stock kernel or have you 'custom' compiled yours?


Y



* Re: improving Reiserfs Performance
  2005-05-25 15:55     ` Yiannis Mavroukakis
@ 2005-05-25 16:24       ` Jos Houtman
  2005-05-25 17:22       ` ReiserFS on large-scale flash Artem B. Bityuckiy
  2005-05-26  0:36       ` improving Reiserfs Performance John Dong
  2 siblings, 0 replies; 7+ messages in thread
From: Jos Houtman @ 2005-05-25 16:24 UTC (permalink / raw)
  To: Yiannis Mavroukakis; +Cc: reiserfs-list

Yiannis Mavroukakis wrote:

> ReiserFS actually performs the best under heavy hammering and 
> thousands of files IMHO ;)

This I acknowledge; it proved to be a lot better than ext3.

> Changing to RAID10 will improve performance
> Reiser4: again IMHO not yet, although your site would make a nice test 
> bed :)
> Have you done any localised (i.e. not over a network) testing on the 
> filesystem? If not, you'll be stabbing in the dark for reasons, at
> least in my book.

Not yet. We have a spare machine on which I can run some localised
tests, but I am still copying the backup to it.
I did try some publicly available performance tests, but since I cannot
tailor them enough to our situation I don't consider the results reliable.

I am thinking about writing a Perl script that does the following:
- forks X times, where X would represent the number of nfsd
  daemons or maybe the number of clients.
- each fork requests a number of random images.
I could then simply measure the number of files opened and the amount of
data read, each divided by the execution time.

But I haven't gotten around to writing it yet.
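
The script outlined above could look roughly like the following, sketched here in Python rather than Perl. The function names, the use of a pipe to report results, and the per-worker seed are illustrative choices and not part of the original plan. It relies on `os.fork`, so it is Unix-only.

```python
import os
import random
import time


def read_random_files(paths, count, seed):
    """Open `count` randomly chosen files; return (files_opened, bytes_read)."""
    rng = random.Random(seed)
    opened = total = 0
    for path in rng.choices(paths, k=count):
        with open(path, "rb") as fh:
            total += len(fh.read())
        opened += 1
    return opened, total


def run_benchmark(paths, workers, per_worker):
    """Fork `workers` children; each reads `per_worker` random files.

    Returns (files per second, bytes per second) over the whole run.
    """
    start = time.monotonic()
    readers = []
    for i in range(workers):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: do the reads, report totals through the pipe
            os.close(r)
            try:
                opened, total = read_random_files(paths, per_worker, seed=i)
                os.write(w, f"{opened} {total}".encode())
            finally:
                os._exit(0)
        os.close(w)
        readers.append((pid, r))
    files = data = 0
    for pid, r in readers:
        opened, total = os.read(r, 64).decode().split()
        os.close(r)
        os.waitpid(pid, 0)
        files += int(opened)
        data += int(total)
    elapsed = time.monotonic() - start
    return files / elapsed, data / elapsed
```

A call such as `run_benchmark(paths, workers=8, per_worker=1000)`, with `paths` collected from one of the RENDERED subdirectories, would give the two rates. Repeated runs over the same small set of files are likely to be served from the page cache, so a file set much larger than RAM gives a more representative result.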

Something to give an indication, though: currently I am scanning through
some directories in a while loop, doing a bash file-exists test [ -e ] to
detect which files are missing (we had some trouble yesterday). I am
doing this on the live system during peak hours.
vmstat indicates that it reads an average of 400 blocks per second,
which is quite low, I think.
On the other machine, to which I am currently copying the backup from a
USB drive, I get about 14000 blocks per second.
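
The scan described above tests each path individually with `[ -e ]`. A variation, sketched below in Python, lists each directory once and compares the result against the expected names. This is a different technique from the one in the message, and the function name and the idea of a precomputed list of expected names are assumptions made for the example.

```python
import os


def find_missing(directory, expected_names):
    """Return the expected file names that do not exist in `directory`.

    Lists the directory once instead of testing each path separately.
    """
    present = set(os.listdir(directory))
    return [name for name in expected_names if name not in present]
```

Whether this is gentler on a loaded server than per-file tests depends on the filesystem and cache state, so it would need to be measured rather than assumed.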


> Do you have a stock kernel or have you 'custom' compiled yours?

I have a custom kernel with the latest 3ware drivers.
It is not necessarily truly optimized; I am no expert (yet).

Another thing I thought of was trying another I/O scheduler.
I remember reading in the kernel documentation that the deadline
scheduler could give better performance when using RAID 10.

Does anybody have experience with this, or seen some performance testing
with it?
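
For reference, on 2.6 kernels that support runtime switching, the scheduler for a block device is exposed through `/sys/block/<dev>/queue/scheduler`, and `elevator=deadline` on the kernel command line selects it at boot. The following Python sketch only reads and parses that file; the device name passed to `current_scheduler` is a placeholder, and the sample line is an example of the format rather than output from the machines in this thread.

```python
def parse_scheduler(text):
    """Parse the contents of /sys/block/<dev>/queue/scheduler.

    The kernel lists the available schedulers on one line and marks the
    active one with square brackets, e.g. 'noop anticipatory deadline [cfq]'.
    Returns (active, available).
    """
    available, active = [], None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def current_scheduler(device):
    # `device` is a block device name such as 'sda' (placeholder).
    with open(f"/sys/block/{device}/queue/scheduler") as fh:
        return parse_scheduler(fh.read())


print(parse_scheduler("noop anticipatory deadline [cfq]"))
# ('cfq', ['noop', 'anticipatory', 'deadline', 'cfq'])
```

Writing a scheduler name to the same file as root switches the scheduler for that device, which makes it possible to compare schedulers under the same test load without rebooting.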







* ReiserFS on large-scale flash
  2005-05-25 15:55     ` Yiannis Mavroukakis
  2005-05-25 16:24       ` Jos Houtman
@ 2005-05-25 17:22       ` Artem B. Bityuckiy
  2005-05-25 17:31         ` Hans Reiser
  2005-05-25 17:33         ` Artem B. Bityuckiy
  2005-05-26  0:36       ` improving Reiserfs Performance John Dong
  2 siblings, 2 replies; 7+ messages in thread
From: Artem B. Bityuckiy @ 2005-05-25 17:22 UTC (permalink / raw)
  To: reiserfs-list

Hello,

I'm designing a new flash file system, and I'm thinking about the
possibility of writing plugins for ReiserFS to implement it (I know, it
sounds crazy). I haven't thoroughly explored whether it is possible yet.

I'm almost certain that it is impossible to do this with the current
Reiser4 and that more changes are needed. Flash devices simply have a
different model from block devices. In a nutshell, you can't write
twice to the same block (the out-of-place writing property), and you must
erase several consecutive blocks before re-using any block (read here:
http://en.wikipedia.org/wiki/Flash_memory).
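
The two constraints just described can be made concrete with a toy model in Python. The class and its sizes are invented for illustration, and it uses the terms "page" for the unit that is written and "erase block" for the group that must be erased together, where the message says "block" and "several consecutive blocks".

```python
class FlashChip:
    """Toy flash model: pages are write-once until their erase block is erased."""

    def __init__(self, erase_blocks=4, pages_per_block=4):
        self.pages_per_block = pages_per_block
        # None marks an erased (writable) page.
        self.pages = [None] * (erase_blocks * pages_per_block)

    def write(self, page, data):
        if self.pages[page] is not None:
            raise ValueError("page already written; erase its block first")
        self.pages[page] = data

    def erase_block(self, block):
        # Erasing always clears every page in the erase block at once.
        start = block * self.pages_per_block
        for page in range(start, start + self.pages_per_block):
            self.pages[page] = None


chip = FlashChip()
chip.write(0, b"a")
chip.write(1, b"b")
try:
    chip.write(0, b"c")       # in-place overwrite is not allowed
except ValueError as exc:
    print("rejected:", exc)
chip.erase_block(0)           # also discards page 1
chip.write(0, b"c")
```

The last three lines show why an ordinary block-device filesystem does not fit directly: updating one page in place means erasing its neighbours too, unless the new data is written somewhere else.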

Currently there are no flash file systems which are usable on large-scale
flash devices, at least in Linux. And there is a real need for such an FS.

So, I'd like to know, are the Reiser4 developers interested in this? Will I
have some assistance from them if I find some Reiser4 limitation? Is it
worth starting to explore Reiser4, or is it better to just start a new FS?

Thanks.

-- 
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru


* Re: ReiserFS on large-scale flash
  2005-05-25 17:22       ` ReiserFS on large-scale flash Artem B. Bityuckiy
@ 2005-05-25 17:31         ` Hans Reiser
  2005-05-25 17:33         ` Artem B. Bityuckiy
  1 sibling, 0 replies; 7+ messages in thread
From: Hans Reiser @ 2005-05-25 17:31 UTC (permalink / raw)
  To: Artem B. Bityuckiy; +Cc: reiserfs-list

Artem B. Bityuckiy wrote:

> Hello,
>
> I'm designing a new flash file system, and I'm thinking about the
> possibility of writing plugins for ReiserFS to implement it (I know, it
> sounds crazy). I haven't thoroughly explored whether it is possible yet.
>
> I'm almost certain that it is impossible to do this with the current
> Reiser4 and that more changes are needed. Flash devices simply have a
> different model from block devices. In a nutshell, you can't write
> twice to the same block (the out-of-place writing property), and you must
> erase several consecutive blocks before re-using any block (read
> here: http://en.wikipedia.org/wiki/Flash_memory).
>
> Currently there are no flash file systems which are usable on
> large-scale flash devices, at least in Linux. And there is a real need
> for such an FS.
>
> So, I'd like to know, are the Reiser4 developers interested in this? Will
> I have some assistance from them if I find some Reiser4 limitation? Is
> it worth starting to explore Reiser4, or is it better to just start a new FS?
>
> Thanks.
>
Sure, go for it.


* Re: ReiserFS on large-scale flash
  2005-05-25 17:22       ` ReiserFS on large-scale flash Artem B. Bityuckiy
  2005-05-25 17:31         ` Hans Reiser
@ 2005-05-25 17:33         ` Artem B. Bityuckiy
  1 sibling, 0 replies; 7+ messages in thread
From: Artem B. Bityuckiy @ 2005-05-25 17:33 UTC (permalink / raw)
  To: Artem B. Bityuckiy; +Cc: reiserfs-list

Just a few refinements.

Traditionally an emulation layer is created on top of flash. This layer
hides the flash limitations and emulates a block device on top of flash. But
this approach is highly inefficient and is not usable in most systems. We
need an FS which is aware of the differences, like JFFS2, but JFFS2
doesn't scale (it is not usable starting from about 256 MiB of flash).
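
As a rough illustration of what such an emulation layer does, the Python sketch below remaps each logical block to a fresh flash page on every write. It is a deliberately tiny, hypothetical model with no garbage collection or wear levelling, and it does not describe how any particular translation layer or JFFS2 works.

```python
class TinyFTL:
    """Toy emulation layer: logical blocks are remapped to fresh flash pages."""

    def __init__(self, physical_pages=8):
        self.flash = [None] * physical_pages  # None = erased page
        self.mapping = {}                     # logical block -> physical page
        self.next_free = 0
        self.stale = 0                        # pages holding outdated data

    def write(self, logical, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no free pages; a real layer would garbage-collect")
        if logical in self.mapping:
            self.stale += 1                   # old copy stays until erased
        self.flash[self.next_free] = data     # always write out of place
        self.mapping[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        return self.flash[self.mapping[logical]]


ftl = TinyFTL()
for version in range(3):
    ftl.write(0, f"v{version}")   # "overwrite" logical block 0 three times
print(ftl.read(0), ftl.stale)     # v2 2
```

Even in this toy, three writes to one logical block consume three physical pages, two of which hold stale data that must later be reclaimed by erasing. That bookkeeping, hidden from the filesystem above, is one plausible source of the inefficiency mentioned in the message.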


-- 
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.


* Re: improving Reiserfs Performance
  2005-05-25 15:55     ` Yiannis Mavroukakis
  2005-05-25 16:24       ` Jos Houtman
  2005-05-25 17:22       ` ReiserFS on large-scale flash Artem B. Bityuckiy
@ 2005-05-26  0:36       ` John Dong
  2 siblings, 0 replies; 7+ messages in thread
From: John Dong @ 2005-05-26  0:36 UTC (permalink / raw)
  To: reiserfs-list


> 
> 
> ReiserFS actually performs the best under heavy hammering and thousands
> of files IMHO ;)

One exception: in periods of extremely heavy read activity, writes do get
choked, and vice versa. I can strongly feel this when doing some major
dd'ing, which probably won't simulate your load at all. The deadline I/O
scheduler did help out, though, for some reason. (My imagination?)

> Reiser4: again IMHO not yet, although your site would make a nice test
> bed :)

I agree with this statement. I ran reiser4 on 2 of my systems. In the
2.6.8 days, one reiser4 volume unexpectedly started exhibiting random
hangs. fsck didn't find anything, and the hangs were VERY hard to
reproduce, if at all (sometimes APT's building of the dependency tree
would spark it, but it's like a 0.1% chance). On the other one, reiser4 is
working fine (2.6.11.10), but I really miss xattr (Beagle). Oh well,
can't have everything.



end of thread, other threads:[~2005-05-26  0:36 UTC | newest]

Thread overview: 7+ messages
2005-05-25 12:54 improving Reiserfs Performance Jos Houtman
     [not found] ` <4294783C.5040001@darthvader.us>
     [not found]   ` <42947BCA.9080702@hyves.nl>
2005-05-25 15:55     ` Yiannis Mavroukakis
2005-05-25 16:24       ` Jos Houtman
2005-05-25 17:22       ` ReiserFS on large-scale flash Artem B. Bityuckiy
2005-05-25 17:31         ` Hans Reiser
2005-05-25 17:33         ` Artem B. Bityuckiy
2005-05-26  0:36       ` improving Reiserfs Performance John Dong
