Ticket ID: SIXXS #1228171 Ticket Status: Resolved PoP: uschi02 - Your.Org, Inc. (Chicago, Illinois)
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Shadow Hawkins on Thursday, 15 October 2009 18:41:36
My tunnels do not appear to be passing traffic, possibly since about 10-12-2009 09:41:04 EDT (according to my Nagios install).
When running "aiccu test", the tests pass until the ping across the IPv6 tunnel. That is, all IPv4 tests work and the local IPv6 pings work (both to ::1 and my tunnel endpoint), but the remote tunnel endpoint ping fails.
I can debug from tunnel T18374 as it is local to me. I don't have remote access to the machine running tunnel T23344 other than through the IPv6 tunnel, but pings to it from another v4-to-v6 endpoint currently fail, so I'm assuming it has a similar problem, though I can't be certain.
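For anyone wanting to reproduce the IPv6 part of that check outside of aiccu, here is a minimal sketch in Python. It is not what "aiccu test" does internally; it only pings the three IPv6 targets mentioned above in order and reports the first one that fails. The function names and the 2001:db8:: addresses in the usage note are hypothetical placeholders, and the `ping -6 -c -W` flags assume the Linux iputils `ping`.

```python
import subprocess


def ping(address, count=1, timeout=5):
    """Return True if pinging `address` over IPv6 succeeds.

    Assumes the Linux iputils `ping` (flags -6, -c, -W); other
    platforms may need different flags.
    """
    cmd = ["ping", "-6", "-c", str(count), "-W", str(timeout), address]
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # The ping binary is missing or could not be started.
        return False
    return result.returncode == 0


def first_failure(results):
    """Given ordered (stage_name, passed) pairs, return the first failed stage or None."""
    for name, passed in results:
        if not passed:
            return name
    return None


def diagnose(local_v6, remote_v6, ping_fn=ping):
    """Ping loopback, local endpoint, then remote endpoint; stop at the first failure."""
    stages = [
        ("IPv6 loopback (::1)", "::1"),
        ("local tunnel endpoint", local_v6),
        ("remote tunnel endpoint", remote_v6),
    ]
    results = []
    for name, address in stages:
        ok = ping_fn(address)
        results.append((name, ok))
        if not ok:
            break  # later stages depend on this one, so stop here
    return first_failure(results)
```

Calling `diagnose("2001:db8::2", "2001:db8::1")` with your own tunnel addresses in place of the placeholders returns the name of the first stage that fails, or `None` if all three pings succeed. In the situation described here it would be expected to return "remote tunnel endpoint".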
-Doug
ADMINEDIT: Original Subject: Problems with tunnels T18374 and T23344 for host uschi02
State change: confirmed
Jeroen Massar on Thursday, 15 October 2009 18:41:40
The state of this ticket has been changed to confirmed
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Jeroen Massar on Thursday, 15 October 2009 18:44:25
Dynamic and Heartbeat tunnels seem to be affected at the moment and they won't update their endpoints/pass traffic. Static tunnels are fine.
Folks at your.org (who are in the middle of a big renumbering event, thus are quite busy atm) are looking into it.
"Me too messages" marked hidden so that the ticket retains its overview.
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Shadow Hawkins on Sunday, 25 October 2009 03:22:47
Just to understand a bit more... I am pretty sure (in fact I am positive -- I am looking at the tcpdump traffic from 2h ago) my tunnel was passing traffic until a few hours ago when I rebooted my router.
It was only then, when my router tried to re-establish my tunnel, that I discovered that uschi02 has been down for what looks like a week or more.
Is this possible? Is the actual routing of traffic somehow independent of PoP status?
I guess the lesson is to check the PoP status before a reboot and if it's down, avoid the reboot if at all possible until the PoP is back up.
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Shadow Hawkins on Saturday, 31 October 2009 19:33:53
The RAID controller on the POP experienced a very weird failure where certain reads would cause it to lock up. The kernel was running fine, but the userland applications were frozen waiting on disk reads that never came back.
So, as long as the tunnel wasn't deleted it looks like the kernel was still forwarding properly. As soon as your end dropped the tunnel and tried to recreate it, that required a disk read of some sort, which never came back.
This is a very unusual failure, so I wouldn't worry about planning around it happening again. The new RAID card is backordered, so we've moved the POP to a new server. It should be back up shortly.
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Jeroen Massar on Tuesday, 20 October 2009 10:01:37
Update: The RAID controller has broken down; awaiting a replacement.
Dynamic Tunnels (AYIYA/heartbeat) on uschi02 misbehaving
Jeroen Massar on Monday, 02 November 2009 14:31:49
All re-installed and up and running again. Enjoy.
State change: resolved
Jeroen Massar on Monday, 02 November 2009 14:34:28
The state of this ticket has been changed to resolved