I thought I was done with TSM. And just when you think you are out, they pull you back in.
Only in this case I am being pulled back in by some feedback from a colleague on my previous column. I didn't want to relegate his input to a comment where it wouldn't get adequate notice, so I am reproducing it here in full and unedited. Full credit for this goes to Mario Correia. Mario has impeccable credentials as a TSM administrator and consultant, and worked with one of the most respected TSM services groups in the industry. Mario had some great feedback on how often to run TSM reclamation on a Data Domain device.
“You really hit all the points with TSM and DD but I disagree on the comment about more aggressive reclamation at the 20%-30% thresholds.
Typically, 50% is the magic number where you hit the point of diminishing returns. If a tape is 50% utilized, I can take two volumes and consolidate onto one. I still have a net gain of 1 volume. But, once I go beyond that point, I’m not really saving anything from a volume perspective plus I’m also working much harder to get that minimal savings. When you throw in the actual space savings post-dedupe, it’s even less.
If we do the math using 100GB volume size, at 20% reclaim, I’d have to read data from 5 volumes or 400GB of data to reclaim that 100GB that was ‘wasted’. But if I get 10:1, that amounts to having TSM read 400GB of data to save 10GB of physical space on the DD. TSM has to read the full amount of “active” pre-comp data on that volume so it works just as hard as if it were reading from real tape.
The other potential gotcha, potential is the key word, is that TSM stores individual files in bundles called aggregates. When reclamation runs, TSM rebuilds these aggregates to free up space from expired files. Think of compacting your .pst file. I don’t have any numbers on this, but I would be concerned that we would be chipping away at our de-dupe rates because we would have TSM aggressively rebuilding these aggregates. Granted we would do better because of variable block, but I’m not sure it’s worth the risk.
My recommendation would be to use the typical 70-90% as my reclamation threshold (I think DD best practice is 90%), and just add some buffer onto your DD sizing to make up for those inefficiencies."
Basically, I agree with everything Mario wrote, with the possible caveat that if you have TSM cycles to burn, you might want to turn up the wick a bit on reclamation beyond what he suggests.
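Mario's arithmetic can be sketched in a few lines. This is an illustrative back-of-the-envelope model, not a TSM tool: it assumes his 100GB volume size and 10:1 dedupe ratio, and it treats the threshold as the reclaimable (expired) fraction of each volume when reclamation kicks in.

```python
# Illustrative sketch of the reclamation cost/benefit math (assumed numbers).

def reclamation_cost(reclaimable_frac, vol_gb=100, dedupe_ratio=10):
    """Data TSM must read, and physical space freed, to net one volume."""
    # Volumes needed so their combined reclaimable space equals one full volume:
    n_volumes = 1 / reclaimable_frac
    # TSM reads all *active* (unexpired) pre-comp data on those volumes,
    # just as it would from real tape.
    data_read_gb = n_volumes * (1 - reclaimable_frac) * vol_gb
    # Post-dedupe, the freed logical volume costs far less physical space.
    physical_saved_gb = vol_gb / dedupe_ratio
    return data_read_gb, physical_saved_gb

for frac in (0.2, 0.3, 0.5):
    read, saved = reclamation_cost(frac)
    print(f"{frac:.0%} reclaimable: read {read:.0f}GB to free ~{saved:.0f}GB physical")
```

At 20% reclaimable this reproduces Mario's figures: TSM reads 400GB of active data to free roughly 10GB of physical space on the DD. At 50%, the read drops to 100GB for the same physical savings, which is his point of diminishing returns.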
On the other hand, something else interesting happens here (which Mario alludes to at one point). Strictly speaking, TSM reclamation shouldn't reclaim all that much physical space on a deduplicated device anyway. Because of the way TSM's progressive incremental backup works, there is not a lot of net new data introduced (post deduplication). So when reclamation cleans up an object because a newer version exists, we need to recognize that the newer version may have introduced only 10% new segments, or less. When we reclaim that file, we don't reclaim the entire capacity associated with it; we reclaim only the portion that represents unique segments, 10% in this example.
So from that perspective, aggressive reclamation doesn't achieve the same return on investment on a deduplicated Data Domain system as it does on non-deduplicated storage, be it physical tape or virtual tape. Which, I suppose, is just another argument for taking the more moderate approach to reclamation that Mario suggests, because otherwise you will just be burning CPU and I/O for very marginal returns in physical capacity reclaimed.