12 Posts
0
4629
January 6th, 2010 00:00
How to increase performance: storing large number of small files?
We are about to replace FileNet with Centera as archiving system in many of our applications.
Some of our applications store 10 files per second (average size ~ 40 kB).
With Centera, throughput is only about 1 file per second.
How can we increase the performance significantly?
holgerjakob_c0722c
2 Intern
•
337 Posts
0
January 6th, 2010 00:00
Hi
Your 60 Files per Minute translate to 1 per second. That is achievable with even a small system.
To improve performance and avoid hitting object count limits too soon, I suggest you do the following:
FileNet
- Try to containerize with FileNet (create larger files that each contain a number of your small files)
- If configurable, set the number of threads to at least 20 (I don't know what determines how many threads FileNet opens)
ApplicationServer
- set FP_OPTION_EMBEDDED_DATA_THRESHOLD to 102400
- set FP_OPTION_BUFFERSIZE to 102400
- Both of them are environment variables. The first ensures that small files are embedded in the CDF, so they do not use up too many objects. The second makes sure enough buffer is allocated, for performance reasons.
- This may already be available in the FileNet configuration for Centera
Centera:
Make sure you use the storage strategy "Performance".
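The containerizing idea above can be sketched in plain Java. This is only a minimal illustration, assuming a ZIP archive is an acceptable container format; `SmallFileBundler` and its methods are hypothetical names and are not part of FileNet or the Centera SDK. The point is that many small files become a single larger object before they are archived.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not a FileNet or Centera SDK class.
final class SmallFileBundler {

    // Packs many small payloads (entry name -> bytes) into one in-memory ZIP container.
    static byte[] bundle(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(e.getKey()));
                zip.write(e.getValue());
                zip.closeEntry();
            }
        }
        return out.toByteArray();
    }

    // Reads a container back into its individual payloads, preserving entry order.
    static Map<String, byte[]> unbundle(byte[] container) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(container))) {
            ZipEntry entry;
            byte[] buffer = new byte[8192];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        }
        return files;
    }
}
```

The trade-off is that retrieving one small file later means fetching and unpacking its whole container, so the container size should match how the files are read back.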
Best regards, Holger
holgerjakob_c0722c
2 Intern
•
337 Posts
1
January 6th, 2010 01:00
Hi
I have always used just an environment variable, not any function calls. Did you try setting an environment variable?
When looking at the API Reference, I see that FP_OPTION_EMBEDDED_DATA_THRESHOLD is listed under FPPool_SetGlobalOption whereas FP_OPTION_BUFFERSIZE is listed under FPPool_SetIntOption.
You probably need to use a different setOption call to set the embedded data threshold.
Holger
nishiichinoe
12 Posts
0
January 6th, 2010 01:00
Thank you very much for your fast reply! I tried that and added the following line to my code (I was already using FP_OPTION_BUFFERSIZE):
thePool.setOption(FPLibraryConstants.FP_OPTION_EMBEDDED_DATA_THRESHOLD, 102400);
Unfortunately I got the following error:
com.filepool.fplibrary.FPLibraryException: FP_SetIntOption/GetIntOption: unknown option name
Do you know what I am doing wrong?
Best Regards
nishiichinoe
12 Posts
0
January 6th, 2010 01:00
Sorry, you're right. I needed to set the global option.
But unfortunately the performance is still the same as before (1 file per second).
I attached my source code; maybe I missed something.
Best Regards
1 Attachment
Connector.java
holgerjakob_c0722c
2 Intern
•
337 Posts
0
January 6th, 2010 02:00
Hi
I'm not really a Java guy, nor the developer at our company :-( and all we do is .NET :-(
The general guidelines for such small files would be:
- open the pool connection once and share it across all threads
- make the number of threads configurable
- use a separate thread for each file you write (write files in parallel)
- for small files it makes no sense to use blobwritepartial
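A minimal Java sketch of these guidelines, under stated assumptions: `BlobStore` is a hypothetical stand-in for the one shared pool connection (it is not a Centera SDK type), and `writeAll` is an illustrative name. The store is created once by the caller, the thread count is a parameter, and each file becomes its own task on a fixed-size thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only; the real write call would go through the Centera SDK.
final class ParallelWriter {

    // Hypothetical stand-in for a single shared pool connection.
    // Implementations must be safe to call from several threads at once.
    interface BlobStore {
        String write(byte[] content) throws Exception;
    }

    // Writes every file as its own task and returns the resulting IDs in input order.
    static List<String> writeAll(BlobStore sharedStore, List<byte[]> files, int threads)
            throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (byte[] file : files) {
                // All tasks reuse the same store; no connection is opened per file.
                pending.add(executor.submit(() -> sharedStore.write(file)));
            }
            List<String> ids = new ArrayList<>();
            for (Future<String> result : pending) {
                ids.add(result.get()); // rethrows a failed write as ExecutionException
            }
            return ids;
        } finally {
            executor.shutdown();
        }
    }
}
```

With a configurable `threads` value it is easy to measure throughput at, say, 1, 10, and 20 threads and see where the cluster stops scaling.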
Holger
gstuartemc
2 Intern
•
417 Posts
0
January 12th, 2010 05:00
Are you using a local cluster for this testing? You cannot use the public clusters to test expected performance - they are too remote and too heavily used.
I have quickly looked at your code and it looks OK.
nishiichinoe
12 Posts
0
January 12th, 2010 06:00
Thank you for looking at my code.
I'm using a local cluster in a local network with 400 Mbit/sec.
petcavage
2 Posts
0
May 26th, 2017 12:00
I was looking through posts on how to set the environment variable FP_OPTION_BUFFERSIZE.
I do not understand how these are set: with what command, or in what file? We are using Centera SDK 3.4.
Any help is appreciated, thank you.
mfh2
208 Posts
0
May 30th, 2017 11:00
Hello petcavage -
You could set this using an environment variable in a script that is used to launch your application, or directly in code through the FPPool_SetIntOption() operation.
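For the environment-variable route, the variable has to be set in the process that launches the application (in a Unix shell script, for example, a line such as `export FP_OPTION_BUFFERSIZE=102400` before the application is started). The small Java sketch below only helps confirm what value the running process actually sees; `EnvOption` and the fallback of -1 are hypothetical and not part of the SDK, which is assumed to read the variable on its own.

```java
import java.util.Map;

// Hypothetical diagnostic helper; it does not configure the Centera SDK itself.
final class EnvOption {

    // Returns the integer value of the named variable, or the fallback
    // when the variable is missing, blank, or not a valid integer.
    static int resolve(Map<String, String> env, String name, int fallback) {
        String raw = env.get(name);
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // -1 is an arbitrary marker meaning "not set in this process".
        int bufferSize = resolve(System.getenv(), "FP_OPTION_BUFFERSIZE", -1);
        System.out.println("FP_OPTION_BUFFERSIZE as seen by this process: " + bufferSize);
    }
}
```

If this prints -1, the variable was not exported in the environment that started the JVM, which is the usual reason a setting appears to have no effect.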
Regards,
Mike Horgan
petcavage
2 Posts
0
May 30th, 2017 13:00
Thank you VERY much, Mike. I set it through the environment as you suggested and it works (I can see the value updated).
Now I understand.
Pete