8 Posts
0
June 15th, 2015 09:00
Frozen Replays
Hello, I'm new to Compellent Storage and with my new position was asked to "figure out" our replay situation.
We have quite a few volumes set up and about half of them have old, expired replays from 3 or 4 years ago. In System Manager most just say:
State: Frozen
Source: Created by user
Create volume: {deleted volume}
Some say source as "created by schedule" and actually have a volume listed under Create Volume.
My understanding is that several years ago there was some kind of issue and they went thru and created new volumes from these replays and then attached those volumes to servers and they're now being used as new volumes. That's what someone told me anyway.
So I have several questions:
1) Is there a way to tell where these replays are now being used as active volumes?
2) Is there a way to get rid of the expired replays so we can reclaim the storage, without affecting anything - specifically the new volumes that were created off them?
3) We replicate most of our LUNs to a different data center and most of the time the replays get stuck and we get way behind on the replication. Looks like replication is only using about 50megs of our 100meg pipe, even tho there's not much else using the replication link. Is there some other way we should be configuring things? Right now we have our replay schedule to go once per hour.
Thanks for any and all help. I plan on calling support, but would like to be more armed with knowledge and have stuff to look at on my side before calling them.
BVienneau
115 Posts
1
June 16th, 2015 07:00
If the volume was recovered from a Replay and still technically runs from that Replay, the Replay will keep showing in the GUI and will never "expire" unless you unmap and then delete the "view" of that Replay. Since that view is your active volume, it will always look like that. It's not keeping any extra data around, since the data that is part of that Replay is the active data on the server.
The only way I know to clean up the look of the volume is to create a new volume of the exact same size, do a CMM (copy mirror migrate) of the old view volume into the new volume, and then delete the view volume. After that it would look "clean" in the GUI. Either option will net you the same amount of used disk space in the end, though, so it may not be worth the effort, and you would need 2x the original disk space while the CMM is running. If you don't know how to do the CMM process, Copilot can walk you through it. It's a pretty straightforward process.
BVienneau
115 Posts
0
June 16th, 2015 08:00
Yes, it should be less than 1 TB. In Enterprise Manager, if you go to the Storage tab and click on the volume, you can see the total amount of space a volume is using and where it's being used: Active, Replay, and Total (actual total with RAID overhead, etc.).
The GUI used to show where the Replays are being used on active volumes by default, but they cleaned that up in a more recent code version. You can still see it by changing the view. On the volume's Replay tab, change the "Set Replay View" to "show view volume tree". There will be a line from the Replay over to the active volume in that view, and it'll show the name you assigned to the volume.
If you are not maxing out the pipe, I'd first check the QOS settings of the Replications. Make sure that you have the link speed set to the actual link speed and then check whether you have any bandwidth limits set on it (right click, edit bandwidth limits). If those are set correctly, you may need to play with some of the advanced settings with guidance from Copilot. As always, latency, etc. will affect the traffic, but you should be able to get close to the pipe limit if latency is low and Replications are tuned right.
Another tweak for Replications is to not snap everything all at once if possible (obviously, server volumes/applications that need to recover to the exact same moment should still be snapped at the same time). Starting them all at the same time, like turning on a fire hose, may not work as efficiently as staggering the start times. Copilot should be able to assist here as well if they can take a look at the rate of change per volume, etc.
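To make the staggering idea concrete, here is a minimal sketch of how start times could be spread across an hourly schedule while keeping volumes that must be recovered together on the same minute. The volume names and the `stagger_offsets` helper are hypothetical and only illustrate the arithmetic; this does not call any Storage Center or Enterprise Manager API, and the resulting offsets would still have to be entered into the replay profiles by hand.

```python
def stagger_offsets(groups, interval_min=60):
    """Spread groups of volumes evenly across one schedule interval.

    groups: list of lists of volume names; volumes in the same inner list
            must be snapped at the same moment (e.g. a database's data and
            log volumes).
    Returns a dict mapping volume name -> minute offset within the interval.
    """
    if not groups:
        return {}
    step = interval_min / len(groups)  # minutes between group start times
    offsets = {}
    for index, group in enumerate(groups):
        minute = int(index * step)
        for volume in group:
            offsets[volume] = minute
    return offsets


# Hypothetical volumes: the two SQL volumes stay together, the rest are spread out.
groups = [["sql-data", "sql-log"], ["files"], ["exchange"]]
schedule = stagger_offsets(groups)
for volume, minute in schedule.items():
    print(f"{volume}: start at :{minute:02d} past the hour")
```

With three groups and a 60 minute interval the groups start 20 minutes apart, so no more than one group begins replicating at the same moment.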
DELL-Sam L
Moderator
7.6K Posts
0
June 15th, 2015 13:00
Hello JJCSGI,
What are you using to manage your Compellent SAN? If you are using EM (Enterprise Manager), you can look at the replay volumes and see how much space is being used and how much is left free. You can also reclaim space that is used by your replays in EM. What is the current version of SCOS running on your Compellent SAN, and what profile are you running on your replays?
Please let us know if you have any other questions.
JJCSGI
8 Posts
0
June 15th, 2015 14:00
We're using EM.
The problem is that I don't see any replay volumes in EM. All the volumes look like normal volumes to me, except for the fact that we have a bunch of expired replays.
EM is v 6.4.5. I don't know where to check for any other versions.
We're doing an hourly replay that expires at 62 minutes.
JJCSGI
8 Posts
0
June 15th, 2015 14:00
I've been using both Enterprise Manager and Storage Manager.
Between all the old replays that have expired but are still showing, it's using about 1TB.
I don't see any way to get rid of the old replays. They're all expired (most for about 3 years).
The replay profile we're using is "daily every 1 hour" and expires at 62 minutes, using the Standard creation method.
I honestly don't know where to check version of SCOS. My version of System Manager is 6.4.5, fwiw
JJCSGI
8 Posts
0
June 16th, 2015 07:00
So that kind of helps with #2. Let's say a replay is showing in the GUI that it's using 400GB. And the volume that was taken from that replay at one point has been expanded and is now using 600GB. Are you saying that our total disk usage is actually 600GB? And not the 1TB it looks like?
It also doesn't really help with the root questions:
1) Is there a way to tell where these replays are now being used as active volumes? I understand you can do the copy/mirror/migrate on the volume and that would get rid of the replay, but I need to know which volume that needs to happen to.
3) We replicate most of our LUNs to a different data center and most of the time the replays get stuck and we get way behind on the replication. Looks like replication is only using about 50megs of our 100meg pipe, even tho there's not much else using the replication link. Is there some other way we should be configuring things? Right now we have our replay schedule to go once per hour.
DELL-Sam L
Moderator
7.6K Posts
0
June 16th, 2015 08:00
Hello JJCSGI,
I am going to send you an email with a link where you can download the SCOS (Storage Center Operating System) Administrator Guide 6.0. Once you have the guide downloaded, page 290 shows you how to expire your old replays. I will also include the EM 2015 guide; page 113 shows you how to manually expire replays.
Please let us know if you have any other questions.
JJCSGI
8 Posts
0
June 16th, 2015 08:00
Thank you, I think I understand what you're all saying. Looking at one of the drives with replays in question, here's what I see:
How would I interpret this?
Replay from 10/19/2012 was used to create the volume on 8/26/2014, and then again to produce the volume on 6/16? This one has 3 frozen replays.
I'll investigate the QOS settings. Looks like our link is coded in EM to 750 megabits and from our graphs we can see it never uses more than 50MBps. With QOS and no other traffic it should be using the full pipe. We've cancelled all replication temporarily cause we're trying to sync a new volume to the other data center.
JJCSGI
8 Posts
0
June 16th, 2015 09:00
To make it clearer, we have a 100MBps pipe, the QOS is set to 750 megabits (93.75 MBps) and we're using approximately 50MBps at any given time.
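The megabit/megabyte conversion above is easy to get wrong, so here is a small sketch of the arithmetic. It assumes 8 bits per byte and decimal prefixes, and that the 50 figure from the graphs is in megabytes per second as written; the helper names are made up for illustration.

```python
def megabits_to_megabytes(megabits_per_s):
    """Convert a rate in megabits per second to megabytes per second."""
    return megabits_per_s / 8.0


def utilization(observed_mbytes_per_s, limit_mbytes_per_s):
    """Fraction of the configured limit that is actually being used."""
    return observed_mbytes_per_s / limit_mbytes_per_s


qos_limit = megabits_to_megabytes(750)   # the "Other - 750 megabits" QOS setting
used = utilization(50, qos_limit)        # 50 MBps observed on the graphs

print(f"QOS limit: {qos_limit} MBps")    # 93.75
print(f"Utilization: {used:.0%}")        # about 53%
```

If the graphs are actually reporting megabits rather than megabytes, the utilization would be far lower, so it is worth confirming the unit on both the graph and the QOS definition.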
mtanen
118 Posts
0
June 16th, 2015 11:00
Is there any QoS setup on the lines or routers? Anything that would throttle? When you change the QoS on the Compellent systems do you see a change in real time?
mtanen
118 Posts
0
June 16th, 2015 11:00
What is the latency and loss on the line between sites?
JJCSGI
8 Posts
0
June 16th, 2015 14:00
We've got 100MBps of a 1GB pipe allocated for replication with a 32ms latency between sites.
None of our equipment has any QOS settings configured, afaik.
I got off the phone with support and the only suggestion he really had was to change the tcp window size.
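The TCP window suggestion comes down to the bandwidth-delay product: a single TCP stream can have at most one window of data in flight per round trip, so window size and latency together cap throughput. A minimal sketch of that arithmetic, assuming the 32 ms figure is round-trip time, that the target rate is the 750 megabit QOS setting, and using a 64 KB window purely as an illustrative value (the actual window size on the Storage Center is not stated here):

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_s):
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bits_per_s * rtt_s / 8


def max_throughput_bits_per_s(window_bytes, rtt_s):
    """Upper bound on one TCP stream's rate for a given window and RTT."""
    return window_bytes * 8 / rtt_s


RTT_S = 0.032                 # assumption: 32 ms is the round-trip time
TARGET_BITS_PER_S = 750e6     # assumption: aiming for the 750 megabit setting
EXAMPLE_WINDOW_BYTES = 65535  # illustrative only: largest window without scaling

needed = bandwidth_delay_product_bytes(TARGET_BITS_PER_S, RTT_S)
ceiling = max_throughput_bits_per_s(EXAMPLE_WINDOW_BYTES, RTT_S)

print(f"Window needed to fill the link: {needed / 1e6:.1f} MB")           # about 3.0 MB
print(f"Ceiling for one stream with a 64 KB window: {ceiling / 1e6:.1f} Mbit/s")  # about 16.4
```

Under these assumptions a single stream needs roughly 3 MB in flight to fill the link, which is why a small window combined with 32 ms of latency can hold replication well below the pipe size even when nothing else is using it.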
JJCSGI
8 Posts
0
June 16th, 2015 15:00
Well, heck, I changed our QOS settings from "Other - 750 megabits" to just using the built-in 1 gigabit per second and our transfer speeds literally quadrupled from 50 mbps to 200 mbps. Makes no rhyme or reason to me, but I'll take it.