Here is a gem I found in the Avamar forums, from an "Ask the Expert" session answered by Ian Anderson concerning large file systems and Avamar. Please share your experiences if you have any.
lmorris99 wrote:
We have one server with 25 million files, scattered through directories six levels deep. We'd like to throw it at our test Avamar grid; any tuning I should look at on the client (or server) side before we set it up for its first backup?
The most important thing to do on a client with so many files is to make sure that the file cache is sized appropriately. The file cache is responsible for the vast majority (>90%) of the performance of the Avamar client. If there's a file cache miss, the client has to go and thrash your disk for a while, chunking up a file that may already be on the server.
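To make the role of the file cache concrete, here is a conceptual sketch in Python. It is a simplification, not Avamar's actual implementation or cache format: on a hit, the client can skip reading and chunking the file entirely; on a miss, it has to do the expensive full read.

```python
import hashlib
import os

# Conceptual sketch only -- not Avamar's real code. The cache maps a hash of
# a file's identity (path, size, mtime) to a hash of its content. On a cache
# hit the client can skip re-reading and re-chunking the file entirely.
file_cache = {}  # metadata hash -> content hash

def metadata_hash(path):
    st = os.stat(path)
    key = f"{path}:{st.st_size}:{st.st_mtime_ns}".encode()
    return hashlib.sha1(key).digest()          # 20 bytes

def content_hash(path):
    h = hashlib.sha1()
    with open(path, "rb") as f:                # expensive: full read of the file
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.digest()                          # 20 bytes

def backup_file(path):
    meta = metadata_hash(path)
    if meta in file_cache:
        return file_cache[meta]                # cache hit: no disk thrashing
    digest = content_hash(path)                # cache miss: read and hash the file
    file_cache[meta] = digest
    return digest
```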
So how to tune the file cache size?
The file cache starts at 22MB in size and doubles in size each time it grows. Each file on a client will use 44 bytes of space in the file cache (two SHA-1 hashes consuming 20 bytes each and 4 bytes of metadata). For 25 million files, the client will generate just over 1GB of cache data.
Doubling from 22MB, we get a minimum required cache size of:
22MB => 44MB => 88MB => 176MB => 352MB => 704MB => 1408MB
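As a quick sanity check on that arithmetic, a few lines of Python reproduce the figures: 25 million files at 44 bytes each is just over 1GB of cache data, and doubling from 22MB lands at 1408MB.

```python
# Reproduce the file cache sizing arithmetic from the post:
# 44 bytes per file, cache starts at 22MB and doubles until it fits.
FILES = 25_000_000
BYTES_PER_ENTRY = 44          # two 20-byte SHA-1 hashes + 4 bytes of metadata

needed_mb = FILES * BYTES_PER_ENTRY / (1024 * 1024)

cache_mb = 22
while cache_mb < needed_mb:
    cache_mb *= 2

print(f"cache data required: {needed_mb:.0f} MB")     # ~1049 MB
print(f"minimum cache size:  {cache_mb} MB")          # 1408 MB
```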
The naive approach would be to set the filecachemax value in the dataset to 1500. However, unless you have an awful lot of memory, you probably don't want to do that, since the file cache must stay loaded in memory for the entire run of the backup.
Fortunately, there is a feature called "cache prefixing" that can be used to set up a unique pair of cache files for a specific dataset. Since there are so many files, you will likely want to work with support to set up cache prefixing for this client and break the dataset up into more manageable pieces.
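The cache prefixes themselves are configured with support, but you can plan the split ahead of time. The sketch below is only an illustration (the mount point and dataset count are hypothetical): it greedily groups top-level directories into a few datasets of roughly equal file count, each of which would then get its own cache prefix.

```python
import os

# Hypothetical illustration: group top-level directories into N datasets of
# roughly equal file count so each dataset (with its own prefixed cache files)
# stays a manageable size. Walking 25M files is slow, so in practice you would
# reuse file counts from an earlier scan or filesystem report.
def count_files(path):
    return sum(len(files) for _, _, files in os.walk(path))

def split_into_datasets(root, n_datasets=4):
    sizes = {d: count_files(os.path.join(root, d))
             for d in os.listdir(root)
             if os.path.isdir(os.path.join(root, d))}
    groups = [[] for _ in range(n_datasets)]
    totals = [0] * n_datasets
    # Greedy bin packing: biggest directories first, into the lightest group.
    for d, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        i = totals.index(min(totals))
        groups[i].append(d)
        totals[i] += size
    return list(zip(groups, totals))

# Example (hypothetical mount point):
# for dirs, total in split_into_datasets("/export/bigfs"):
#     print(total, dirs)
```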
One quick word of warning -- as the saying goes, if you have a hammer, everything starts to look like a nail. Cache prefixing is the right tool for this job because of the large dataset, but it shouldn't be the first thing you reach for whenever there is client performance tuning to be done.
On to the initial backup.
If you plan to have this client run overtime during its initial backup, you will have to make sure that there is enough free capacity on the server to allow garbage collection to be skipped for a few days while the initial backup completes.
If there is not enough free space on the server, the client will have to be allowed to time out each day and create partials. Make sure the backup schedule associated with the client is configured to end no later than the start of the blackout window; if a running backup is killed by garbage collection, no partial will be created.
You will probably want to start with a small dataset (one that will complete within a few days) and gradually increase the size of the dataset (or add more datasets if using cache prefixing) to get more new data written to the server each day. The reason for this is that partial backups are only retained on the server for 7 days. Unless a backup completes successfully within 7 days of the first partial, any progress made by the backup will be lost when the first partial expires.
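A rough back-of-the-envelope check makes the constraint concrete. The data volume and daily ingest figures below are made-up examples; only the 7-day retention comes from the post.

```python
# Back-of-the-envelope check against the 7-day partial retention window.
# The data size and daily ingest rate below are hypothetical examples.
PARTIAL_RETENTION_DAYS = 7

new_data_gb = 3000        # new data this dataset would send (example)
daily_ingest_gb = 500     # data the client can send per scheduled window (example)

days_to_complete = -(-new_data_gb // daily_ingest_gb)   # ceiling division -> 6

if days_to_complete <= PARTIAL_RETENTION_DAYS:
    print(f"OK: should complete in ~{days_to_complete} days of partials")
else:
    print("Too big: shrink the dataset or the first partial will expire "
          "before the backup finishes")
```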
After the initial backup completes, typical filesystem backup performance for an Avamar client is about 1 million files per hour. You will likely have to do some tuning to get this client to complete on a regular basis, even for incrementals. The speed of an incremental Avamar backup is generally limited by the disk performance of the client itself, but it's important to run some performance testing to isolate the bottleneck before taking corrective action. If we're being limited by network performance, we obviously don't want to start by tweaking disk performance.
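For scale: at the quoted rate, 25 million files works out to roughly 25 hours per backup, which already overruns a daily schedule before any other overhead. A trivial sketch of that estimate (the scan rate is only the ballpark figure quoted above, and the window length is a hypothetical example):

```python
# Estimate the steady-state backup window at the quoted scan rate.
FILES = 25_000_000
FILES_PER_HOUR = 1_000_000      # typical post-initial rate quoted above
WINDOW_HOURS = 12               # hypothetical nightly backup window

hours = FILES / FILES_PER_HOUR
print(f"~{hours:.0f} hours per backup at {FILES_PER_HOUR:,} files/hour")
if hours > WINDOW_HOURS:
    print("Will not fit the window without tuning (or a smaller dataset)")
```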
The L2s on the client support teams have a good deal of experience with performance tuning and can work with you to run some testing. The tests that are normally run are:
- An iperf test to measure raw network throughput between client and server
- A "randchunk" test, which generates a set of random chunks and sends them to the grid in order to test network backup performance
- A "degenerate" test which, as I mentioned previously, processes the filesystem and discards the results in order to measure disk I/O performance
- OS performance monitoring to ensure we are not being bottlenecked by system resource availability (CPU cycles, memory, etc.) -- see the sketch after this list
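For that last item, periodically sampling CPU, memory, and disk counters while a backup runs is often enough to see which resource is saturated. A minimal sketch using the third-party psutil library (the interval, sample count, and output format are just examples):

```python
import psutil  # third-party: pip install psutil

# Sample basic OS counters while a backup runs, to spot an obvious
# CPU, memory, or disk bottleneck. Interval and duration are examples.
def monitor(samples=60, interval=5):
    prev_disk = psutil.disk_io_counters()
    for _ in range(samples):
        cpu = psutil.cpu_percent(interval=interval)   # blocks for `interval` seconds
        mem = psutil.virtual_memory().percent
        disk = psutil.disk_io_counters()
        read_mb = (disk.read_bytes - prev_disk.read_bytes) / 1e6
        prev_disk = disk
        print(f"cpu={cpu:5.1f}%  mem={mem:5.1f}%  read={read_mb:8.1f} MB")

if __name__ == "__main__":
    monitor()
```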
Edit -- 2013-08-06: The behaviour for partial backups changed in Avamar 6.1. More information can be found in the following forum thread:
Re: Garbage collection does not reclaim expected amount of space