iSCSI initiator connection failure.

Hi gurus,

I need help ASAP, I'm going almost crazy!

I have a PS6010XV and a PS6010E connected to a XenServer 5.6 SP2 host, but sometimes (I think when there are many I/O requests) I lose the connection to the volumes. I receive the info/error/warning messages shown below.

I have 2 bonded 10Gb NICs on the XenServer, and multipath is active on the XenServer.
On the Dell box I have already unchecked "enable load balancing" on the pools, because with that setting enabled the problem occurs more often.

On the Box Group

Level: INFO
Time: 20-10-2011 18:46:52
Member: PTALFEQLMN02 (always with this member - PS6010E)
Subsystem: MgmtExec
Event ID: 7.2.15
iSCSI session to target '192.168.110.13:3260, iqn.2001-05.com.equallogic:0-8a0906-868c3d70a-7d1000000744e0b3-ptalffps01-lvm01-data-01' from initiator '192.168.110.20:47175, iqn.2011-06.com.example:a46c8099' was closed.
iSCSI initiator connection failure.
Reset received on the connection.

On the XenServer

Oct 20 18:38:22 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 19669327808
Oct 20 18:38:22 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 20545722848
Oct 20 18:38:22 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 20822738408
Oct 20 18:46:54 PTALFXEND01 kernel: connection4:0: ping timeout of 15 secs expired, recv timeout 10, last rx 1069993397, last ping 1069994397, now 1069995897
Oct 20 18:46:54 PTALFXEND01 kernel: connection4:0: detected conn error (1011)
Oct 20 18:46:54 PTALFXEND01 kernel: device-mapper: multipath: Failing path 8:64.
Oct 20 18:46:54 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 188600
Oct 20 18:46:54 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 6831978037
Oct 20 18:46:54 PTALFXEND01 kernel: end_request: I/O error, dev dm-7, sector 13582433048
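
For anyone reading these logs later: the "ping timeout" line carries three counters (last rx, last ping, now) that can be compared directly. Below is a minimal Python sketch that pulls them out of such a line. The function name and the regular expression are my own illustration, written only against the message format shown above, and the counters are assumed to be kernel tick values.

```python
import re

# Pattern written against the "ping timeout" message format shown in this thread.
PING_TIMEOUT = re.compile(
    r"connection(?P<conn>\d+:\d+): ping timeout of (?P<timeout>\d+) secs expired, "
    r"recv timeout (?P<recv>\d+), last rx (?P<last_rx>\d+), "
    r"last ping (?P<last_ping>\d+), now (?P<now>\d+)"
)


def parse_ping_timeout(line):
    """Return the fields of a ping-timeout log line as a dict, or None if no match."""
    match = PING_TIMEOUT.search(line)
    if match is None:
        return None
    info = {
        key: (value if key == "conn" else int(value))
        for key, value in match.groupdict().items()
    }
    # Differences between the raw counters (assumed to be kernel ticks).
    info["ticks_rx_to_ping"] = info["last_ping"] - info["last_rx"]
    info["ticks_ping_to_now"] = info["now"] - info["last_ping"]
    return info


line = (
    "Oct 20 18:46:54 PTALFXEND01 kernel: connection4:0: ping timeout of 15 secs "
    "expired, recv timeout 10, last rx 1069993397, last ping 1069994397, "
    "now 1069995897"
)
print(parse_ping_timeout(line))
```

For the line above, the differences come out as 1000 and 1500 ticks. That would line up with the logged 10 and 15 second values if the kernel counts 100 ticks per second, which is an assumption on my part.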

Responses(3)

DELL-Joe S

7 Technologist

729 Posts

October 21st, 2011 07:00

rebola,

What the array is saying is that the host initiator at 192.168.110.20 closed the connection. Check your network and firmware versions.

1: Ensure you have the latest approved NIC/HBA drivers installed (check with Citrix).

2: Make sure you are running the latest version of the array firmware.
If using: v5.1.x use: 5.1.2 (just released today); v5.0.x use 5.0.8; v4.3.x use 4.3.8

3: Verify ping and traceroute between each member in the group and each Xen interface. Note that you need to test all interface combinations.
Telnet/SSH to each member (make sure you use a physical IP and not the group IP).
Usage: ping "-I <sourceIP> <destinationIP>"
The sourceIP is the IP address of a specific array ETH port. This is done from a group prompt after logging in to the array. The quotes are needed at the group prompt.
Note: the option above is a capital letter "I" (eye); ping without -I will only use eth0.

Example:
From MemberA
Eth0_IP to MemberB eth0_IP
Eth0_IP to MemberB eth1_IP
Eth0_IP to MemberB eth2_IP
Eth0_IP to Xen interface_1
Eth0_IP to Xen Interface_2
Eth1_IP to MemberB eth0_IP
Eth1_IP to MemberB eth1_IP
Eth1_IP to MemberB eth2_IP
Eth1_IP to Xen interface_1
Eth1_IP to Xen Interface_2
Etc.
Then do each eth interface to each Xen interface (1 & 2).
Repeat the process from MemberB.
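
Since the number of combinations grows quickly, here is a minimal Python sketch that prints the full list of ping commands to run from each member. It is only an illustration: the member names and IP addresses are hypothetical, the function name is my own, and it assumes the group-prompt ping takes the source IP after -I followed by the destination IP, all inside the quotes.

```python
def ping_commands(members, xen_ips):
    """Build one ping command per (source port, destination) pair.

    members maps a member name to the list of its eth port IPs.
    Returns a list of (member to run it from, command string) tuples.
    """
    commands = []
    for src_member, src_ips in members.items():
        for src_ip in src_ips:
            # Destinations: every port on the other members, then every Xen interface.
            destinations = [
                ip
                for member, ips in members.items()
                if member != src_member
                for ip in ips
            ]
            destinations += xen_ips
            for dst_ip in destinations:
                commands.append((src_member, f'ping "-I {src_ip} {dst_ip}"'))
    return commands


# Hypothetical addresses, for illustration only.
members = {
    "MemberA": ["10.0.0.11", "10.0.0.12"],
    "MemberB": ["10.0.0.21", "10.0.0.22"],
}
xen_ips = ["10.0.0.101", "10.0.0.102"]

for member, command in ping_commands(members, xen_ips):
    print(f"{member}> {command}")
```

Each command still has to be run by hand from the member named at the start of the line; the script only makes sure no combination is skipped.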

GrpName>support traceroute "-s [ETH port source IP] [destination IP to traceroute to]"
(Observe the placement of the quotes: " ")
Note: do this from each array eth interface to each array destination interface, and also include all Xen interfaces.

4: Ping/traceroute from the SAN switches to all interfaces (both Xen and array).

If you have any failures, check your network setup.

Regards,
Joe

rebola

2 Posts

October 21st, 2011 09:00

Hi Joe,

I have tested and everything looks fine! The only thing I don't understand is that sometimes I get a reply with time=10.000 ms on all interfaces:

64 bytes from 192.168.110.20: icmp_seq=31 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=32 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=33 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=34 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=35 ttl=64 time=10.000 ms
64 bytes from 192.168.110.20: icmp_seq=36 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=37 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=38 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=39 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=40 ttl=64 time=0.000 ms
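
To pick out replies like icmp_seq=35 from a longer ping run, a minimal Python sketch along these lines could help. The function name and the 1 ms threshold are arbitrary choices for illustration, and the pattern is written only against the reply format shown above.

```python
import re

# Matches the "icmp_seq=N ttl=N time=N ms" part of a ping reply line.
REPLY = re.compile(r"icmp_seq=(\d+) ttl=(\d+) time=([\d.]+) ms")


def slow_replies(output, threshold_ms=1.0):
    """Return (icmp_seq, time_ms) for every reply at or above threshold_ms."""
    return [
        (int(seq), float(time_ms))
        for seq, _ttl, time_ms in REPLY.findall(output)
        if float(time_ms) >= threshold_ms
    ]


sample = """\
64 bytes from 192.168.110.20: icmp_seq=34 ttl=64 time=0.000 ms
64 bytes from 192.168.110.20: icmp_seq=35 ttl=64 time=10.000 ms
64 bytes from 192.168.110.20: icmp_seq=36 ttl=64 time=0.000 ms
"""
print(slow_replies(sample))
```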

The traceroute looks fine too:

PTALFEQLGN01> support traceroute "-s 192.168.110.14 192.168.110.11"
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from Technical Support.
cli-child to 192.168.110.11 (192.168.110.11) from 192.168.110.14, 30 hops max, 40 byte packets
1 192.168.110.11 (192.168.110.11) 0.000 ms 0.000 ms 0.000 ms

I can't ping from our SAN switches because the iSCSI network is different from the one our switches are on.

Our firmware versions are:

Name          Status  Version           Disks  Capacity  FreeSpace  Connections
------------  ------  ----------------  -----  --------  ---------  -----------
PTALFEQLMN02  online  V5.0.4 (R156082)  16     24.86TB   18.44TB    2
PTALFEQLMN01  online  V5.0.4 (R156082)  16     7.32TB    1.69TB     2

JonathanSE

10 Posts

September 10th, 2014 07:00

Hello.

I know it's an old post, but I have a similar issue (same log messages) with my iSCSI connection between a RHEL box (release 6.2, kernel 2.6.32) and an EQL array (FW 5.2.1). Did you finally fix it?

Regards,

Jon
