Tuesday, 27 August 2013

Avamar and large datasets with many files

Here is a gem I found in the Avamar forums, from an "Ask the Expert" session answered by Ian Anderson, concerning large file systems and Avamar. Please share your experiences if you have any.


lmorris99 wrote:

We have one server with 25 million files, scattered through directories six levels deep.
We'd like to throw it at our test Avamar grid; any tuning I should look at on the client (or server) side before we set it up for its first backup?

The most important thing to do on a client with so many files is to make sure that the file cache is sized appropriately. The file cache is responsible for the vast majority (>90%) of the performance of the Avamar client. If there's a file cache miss, the client has to go and thrash your disk for a while chunking up a file that may already be on the server.

So how to tune the file cache size?

The file cache starts at 22MB in size and doubles in size each time it grows. Each file on a client will use 44 bytes of space in the file cache (two SHA-1 hashes consuming 20 bytes each and 4 bytes of metadata). For 25 million files, the client will generate just over 1GB of cache data.

Doubling from 22MB, we get a minimum required cache size of:
22MB => 44MB => 88MB => 176MB => 352MB => 704MB => 1408MB
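
If you want to sanity-check that arithmetic for a different file count, the doubling rule is easy to script. This is just back-of-the-envelope math in bash, not an Avamar utility:

FILES=25000000                                # number of files on the client
NEEDED_MB=$(( FILES * 44 / 1024 / 1024 ))     # 44 bytes of cache data per file, ~1048MB here
CACHE_MB=22                                   # the file cache starts at 22MB...
while [ $CACHE_MB -lt $NEEDED_MB ]; do
    CACHE_MB=$(( CACHE_MB * 2 ))              # ...and doubles each time it grows
done
echo "minimum file cache size: ${CACHE_MB}MB" # prints 1408MB for 25 million files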

The naive approach would be to set the filecachemax in the dataset to 1500. However, unless you have an awful lot of memory, you probably don't want to do that since the file cache must stay loaded in memory for the entire run of the backup.

Fortunately there is a feature called "cache prefixing" that can be used to set up a unique pair of cache files for a specific dataset. Since there are so many files, you will likely want to work with support to set up cache prefixing for this client and break the dataset up into more manageable pieces.
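
As a rough illustration only (I have not set this up myself, so treat the option names and values below as assumptions to confirm with support for your Avamar version), a dataset split up with cache prefixing might carry per-dataset options along these lines:

--cacheprefix=set1      # "set1" is a made-up prefix; gives this dataset its own pair of cache files
--filecachemax=352      # cap this dataset's file cache at 352MB instead of letting it grow to 1408MB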

One quick word of warning -- as the saying goes, if you have a hammer, everything starts to look like a nail. Cache prefixing is the right tool for this job because of the large dataset but it shouldn't be the first thing you reach for whenever there is client performance tuning to be done.

On to the initial backup.

If you plan to have this client run overtime during its initial backup, you will have to make sure that there is enough free capacity on the server to allow garbage collection to be skipped for a few days while the initial backup completes.

If there is not enough free space on the server, the client will have to be allowed to time out each day and create partials. Make sure the backup schedule associated with the client is configured to end no later than the start of the blackout window. If a running backup is killed by garbage collection, no partial will be created.

You will probably want to start with a small dataset (one that will complete within a few days) and gradually increase the size of the dataset (or add more datasets if using cache prefixing) to get more new data written to the server each day. The reason for this is that partial backups are only retained on the server for 7 days. Unless a backup completes successfully within 7 days of the first partial, any progress made by the backup will be lost when the first partial expires.

After the initial backup completes, typical filesystem backup performance for an Avamar client is about 1 million files per hour. You will likely have to do some tuning to get this client to complete on a regular basis, even doing incrementals. The speed of an incremental Avamar backup is generally limited by the disk performance of the client itself but it's important to run some performance testing to isolate the bottleneck before taking corrective action. If we're being limited by the network performance, obviously we don't want to try to tweak disk performance first.
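
To put that in perspective: at 1 million files per hour, simply walking this client's 25 million files works out to roughly 25 hours per pass, which is why the tuning and testing below matter even for incrementals.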

The L2 engineers on the client support teams have a good deal of experience with performance tuning and can work with you to run some testing. The tests that are normally run are:
  • An iperf test to measure raw network throughput between client and server (see the example after this list)
  • A "randchunk" test, which generates a set of random chunks and sends them to the grid in order to test network backup performance
  • A "degenerate" test which, as I mentioned previously, processes the filesystem and discards the results in order to measure disk I/O performance
  • OS performance monitoring to ensure we are not being bottlenecked by system resource availability (CPU cycles, memory, etc.)
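For reference, the iperf test in the first bullet is just a plain client-to-grid throughput run, something like the following (the host name is a placeholder, and iperf is not part of the Avamar client install, so support usually supplies the binary):

iperf -s                          # on the grid side (utility or storage node), start a listener
iperf -c <utility-node-ip> -t 30  # on the client, run a 30-second throughput test against it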
Edit -- 2013-08-06: The behaviour for partial backups changed in Avamar 6.1. More information in the following forum thread:
Re: Garbage collection does not reclaim expected amount of space

Friday, 26 July 2013

1,23996,CLI failed to connect to MCS

I tried to install a remote MCCLI instance as I do not like to work directly from the utility node, and I kept getting this error and could not find the problem.

Support told me running MCCLI from a remote host was not supported (of course it is).

When I finally looked at the config file "/root/.avamardata/6.1.1-87/var/mc/cli_data/prefs/mcclimcs.xml",
I noticed I had the "mcsaddr" field wrong.
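
A quick way to spot the same mistake (the path contains my MCCLI version, so adjust it to yours) is to check that the mcsaddr entry really points at your utility node:

grep mcsaddr /root/.avamardata/6.1.1-87/var/mc/cli_data/prefs/mcclimcs.xml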

Make sure you don't have any configuration problems...

Symcli on Linux

Having problems making symcli work in a server/client (remote server) model?

Try the following:
  1. Remove any existing packages for a clean reinstall: rpm -e symcli-data-V7.5.0-0.i386 symcli-master-V7.5.0-0.i386 symcli-symcli-V7.5.0-0.i386 symcli-thincore-V7.5.0-0.i386 symcli-64bit-V7.5.0-0.x86_64 symcli-cert-V7.5.0-0.i386 symcli-base-V7.5.0-0.i386 symcli-symrecover-V7.5.0-0.i386
  2. Reinstall: #> tar xvzf se7500-Linux-i386-ni.tar.gz
  3. Then you have to use the supplied install script: #> ./se7500_install.sh
  4. Make sure /usr/emc/API/symapi/config/netcnfg is properly configured (see the example after this list)
  5. Make sure the connection name defined in the above file is exported as an environment variable in your shell: #> export SYMCLI_CONNECT=SYMAPI_SERVER (here "SYMAPI_SERVER" is the connection name from the first field of the netcnfg file)
  6. Restart the daemon: #> stordaemon shutdown storsrvd && stordaemon start storsrvd
  7. Make sure it's running: #>  /opt/emc/SYMCLI/bin/symcfg list -service
  8. Check that the logs are clean:  #> /opt/emc/SYMCLI/bin/stordaemon showlog storsrvd
  9. Check the currently running daemons: #> /opt/emc/SYMCLI/bin/stordaemon list (storapid, storwatchd and stordrvd should be running)
  10. Export your path: #> export PATH=$PATH:/opt/emc/SYMCLI/bin
  11. Test your config: #> symcfg list
  12. As a last resort, read the following document, found on Powerlink (when all else fails, read the instructions): "Solutions Enabler 7.x Installation Guide"
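Here is what steps 4 and 5 look like on my setup. The connection name, host name, IP and port below are placeholders for illustration, so double-check the exact netcnfg field layout against the Solutions Enabler installation guide:

# /usr/emc/API/symapi/config/netcnfg -- one line per remote SYMAPI server
SYMAPI_SERVER - TCPIP sym-server01 192.168.1.50 2707 ANY

# then, in the client shell:
#> export SYMCLI_CONNECT=SYMAPI_SERVER
#> export SYMCLI_CONNECT_TYPE=REMOTE   # forces client/server mode; may not be needed on every install
#> symcfg list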
Good luck !

Tuesday, 23 July 2013

VMAX Device types

TDAT:
TDAT, or thin data device, is an internal LUN that is assigned to a thin pool. Each TDAT LUN is built from multiple physical drives configured to provide a specific data protection type, for example a RAID-5 (3+P) LUN or a RAID-1 mirrored LUN. Multiple TDAT devices are assigned to a pool. When thin pool LUNs are created for presentation to a host, the data is striped across all of the TDAT devices in the pool. The pool can be enlarged by adding devices and rebalancing data placement (a background operation with no impact on the host).

Thin Devices (TDEVs):
They consume no disk space; they are only pointers residing in memory. A TDEV is a host-accessible LUN (typically presented redundantly to FA ports) that is "bound" to a thin pool of TDATs for its capacity needs. Allocation is done in 768 KB increments (12 tracks) when the device is bound, and the data is striped across the back-end TDAT devices in the pool with a 768 KB stripe size.
Each TDEV is presented to an FA port for host server allocation. When using thin provisioning, thin pool LUNs are employed, and TDEVs are required in order to use the EMC Fully Automated Storage Tiering for Virtual Pools (FAST VP) features.


Meta Devices (aka Meta Volumes):
They allow you to present a device larger than the Symmetrix single-device maximum of 240 GB to a host, by combining multiple Symmetrix devices into one larger host-visible device.


Data devices:
Non-addressable Symmetrix private devices (they cannot be mapped to a front-end port).
They provide space to thin devices.
You "add" data devices to a thin pool, but you "bind" thin devices to a thin pool.

Avamar support

Here is a list of commands Avamar support uses to diagnose a problem on the utility node:

ssh to a storage node:
ssn 0.8

Using /usr/local/avamar/var/probe.xml
ssh -x admin@192.168.255.10  ''
su - admin
ssh-agent bash
ssh-add .ssh/dpnid

Send a command to each and every one of the storage nodes (mapall)
mapall --noerror 'grep -i "error" /var/log/messages*'

cd proactive_check/
head hc_results.txt
chmod +x proactive_check.pl
./proactive_check.pl
mccli event show --unack=true| grep Module
status.dpn


Bios version:
 mapall --noerror --all 'omreport system summary |grep -A4 BIOS'

Wednesday, 15 May 2013

Avamar nodes Gen4s hardware

Avamar recently introduced the Gen4S line to its hardware-based grid offerings.
Here is the information I have been able to gather so far regarding Gen4S hardware:

  • They have discontinued the 1.3 TB and 2.6 TB nodes; they now only make 2 TB, 3.9 TB and 7.8 TB nodes. The nodes are now manufactured by Intel, not Dell.
  • All Gen4S nodes require Avamar software version 6.1 SP1 or later
  • Gen4S nodes are compatible with existing Gen4 systems (provided the nodes are running a compatible Avamar software version).



Hardware Specifications

M600 (2.0 TB licensed capacity)
Six 3.5” hard drives
Dual 750W power supplies
Eight 10/100/1000baseT GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system

M1200 (3.9 TB licensed capacity)
Six 3.5” hard drives
Dual 750W power supplies
Eight 10/100/1000baseT GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system

M2400 (7.8 TB licensed capacity)
Twelve 3.5” hard drives
One 2.5” SSD drive (internal mounting)
Dual 750W power supplies
Eight 10/100/1000baseT GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system

Avamar Business Edition/S2400 node (7.8 TB licensed capacity)
Available as single node server only
Eight 3.5” hard drives
One 2.5” SSD drive (internal mounting)
Dual 750W power supplies
Eight 10/100/1000baseT GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system
No replication required

Extended Retention ADS Gen4S Media Access node
Twelve 3.5” hard drives
One 2.5” SSD drive (internal mounting)
Dual 750W power supplies
Eight 10/100/1000baseT GbE ports
Two 8 Gbps Fibre Channel ports
Two 10 GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system
Note: For installation instructions and hardware specifications for the Gen4S Media Access node, see the "Avamar 6.1 Extended Retention Media Access Node Customer Hardware Installation Guide" (P/N 300-013-367).

ADS Accelerator node
Two 2.5” hard drives
Dual 750W power supplies
Four 10/100/1000baseT GbE ports
RMM4 management port
Avamar service port (hardware: NIC8, software: eth7)
SuSE Linux Enterprise Server v11 sp1 operating system


Source: "EMC Avamar Datastore Gen4S Single Node Customer Installation Guide"
P/N 300-999-651

Sunday, 28 April 2013

VMAX Links

Eric Stephani's VMAX Training summary:
http://ericstephani.com/?p=230

IBM SVC manual explanation regarding VMAX device types:
http://pic.dhe.ibm.com/infocenter/svc/ic/index.jsp?topic=%2Fcom.ibm.storage.svc.console.doc%2Fsvc_symmetrixcontlucreation_1ev5ds.html


VMAX architecture explained:
http://www.emcsaninfo.com/2012/11/emc-vmax-architecture-detailed-explanation.html