September 10th, 2013 08:00
I have a case open ... but...
I'm getting thousands of these messages soon after I start a File System Replication:
Sep 10 08:50:36 2013 DART:REP:WARNING:22 Slot 2:
1378817434:
Source=4514_APMXYZ_123_0000(alias=anyones_fs_rep), transferring. Data connection down. Source is retrying.
I deleted the target FS and its checkpoints in order to copy it all over again.
Once I start the replication session, the data movement looks good. Then soon (within an hour), the replication comes to a halt and those messages begin:
Max Out of Sync Time (minutes) = 10
Next Transfer Size (KB) = 2351954984
Current Transfer Size (KB) = 2351954984
Current Transfer Remain (KB) = 2347008584
Estimated Completion Time = Sun Nov 03 00:11:01 EDT 2013
Current Transfer is Full Copy = Yes
Current Transfer Rate(KB/s) = 0
Current Read Rate(KB/s) = 0
Current Write Rate(KB/s) = 0
Previous Transfer Rate(KB/s) = 0
Previous Read Rate(KB/s) = 0
Previous Write Rate(KB/s) = 0
Average Transfer Rate(KB/s) = 0
Average Read Rate (KB/s) = 0
Average Write Rate (KB/s) = 0
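For readers following along, the figures above can be sanity-checked with a little arithmetic. This is only a rough sketch in Python: the function names are made up for illustration, and it assumes the "KB" values are taken exactly as printed in the session output.

```python
# Values copied from the replication session output above (in KB).
CURRENT_TRANSFER_SIZE_KB = 2_351_954_984
CURRENT_TRANSFER_REMAIN_KB = 2_347_008_584


def transferred_kb(total_kb: int, remaining_kb: int) -> int:
    """KB already moved before the session stalled."""
    return total_kb - remaining_kb


def percent_complete(total_kb: int, remaining_kb: int) -> float:
    """Share of the full copy that has completed, as a percentage."""
    return 100.0 * (total_kb - remaining_kb) / total_kb


def hours_to_finish(remaining_kb: int, rate_kb_per_s: float) -> float:
    """Hours left at a given rate; a rate of 0 KB/s never finishes."""
    if rate_kb_per_s <= 0:
        return float("inf")
    return remaining_kb / rate_kb_per_s / 3600.0


done = transferred_kb(CURRENT_TRANSFER_SIZE_KB, CURRENT_TRANSFER_REMAIN_KB)
pct = percent_complete(CURRENT_TRANSFER_SIZE_KB, CURRENT_TRANSFER_REMAIN_KB)
print(f"{done} KB transferred ({pct:.2f}% of the full copy)")
print(hours_to_finish(CURRENT_TRANSFER_REMAIN_KB, 0))  # stalled: inf
```

In other words, only about 4.9 million KB of the full copy had moved before every rate dropped to 0.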
Has anyone seen this issue and resolved it?
christopher_ime
September 13th, 2013 17:00
My first thought is that you may be trying to spray a "firehose down a straw", and the system overcompensates as it works to negotiate the speed between sites (at least, this is how I understand the best practice).
When configuring replication, I always enter at least one "Interconnect Bandwidth Schedule" rather than leaving it blank. If the client has a T1, for instance (1.544 Mbps), I'll have at least one entry such as:
1) Bandwidth (Kbits/sec): 1544
NOTE: I use the "network" multiplier of 1000, not 1024
2) Check: "All Days"
3) Check: "All Hours"
In other words, this is (in theory) "unlimited" for a T1 connection, but set explicitly rather than left blank.
Can you try the same if you haven't already?
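As a quick illustration of the conversion in the NOTE above, here is a minimal Python sketch. The function name is hypothetical and not part of any EMC tool; it only shows the decimal (x1000) multiplier being applied to the link speed.

```python
def mbps_to_kbits_per_sec(mbps: float) -> int:
    """Convert a link speed in Mbps to Kbits/sec using the decimal
    ("network") multiplier of 1000 rather than 1024."""
    return round(mbps * 1000)


# T1 line speed from the post: 1.544 Mbps
print(mbps_to_kbits_per_sec(1.544))  # 1544
```

The result is the value you would type into the "Bandwidth (Kbits/sec)" field for a T1 link.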
DHoffman2
September 16th, 2013 05:00
Well, the problem turned out to be our network intrusion software, called "Tipping Point", and kudos to EMC for coming to that determination. But this was ONE WEIRD issue: 14 other replications were working fine, and 1 failed.
Once they opened up the software (not imposing any policies), the replication picked up and finished over the weekend.