TCP connection stuck from routed subnet machine but works from tunnel endpoint machine
Shadow Hawkins on Monday, 14 December 2015 20:29:38
Hey,
my setup: Internet <-> PPPoE <-> AVM Fritzbox router <-> Linux router with aiccu <-> test machines
Tunnel T127214, routed subnet R225402
I'm currently trying to debug why I cannot access https://dot.kde.org and other KDE sites from test machines within a routed subnet (2001:6f8:900:9029::/64) while I can access them from the tunnel endpoint this subnet is routed to (2001:6f8:900:1029::2). Access to other IPv6 enabled sites like http://www.heise.de or https://www.google.com works just fine from both the routed test machines and the tunnel endpoint machine.
My Fritzbox does PPPoE to my ISP, but it doesn't do IPv6. Inside the private 192.168.191.0/24 I have a Linux router running aiccu + radvd. The Linux router has a bridge, br0, where radvd announces the routed subnet that aiccu provides. aiccu's own traffic is also routed via br0. It's a bridge as I used to run a couple of KVM machines on that machine as well but they're currently all turned off.
So basically br0 contains a single Ethernet device. Here's how it looks to ip:
5: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 00:1b:21:6a:bf:49 brd ff:ff:ff:ff:ff:ff
inet 192.168.191.4/24 brd 192.168.191.255 scope global br0
valid_lft forever preferred_lft forever
inet6 2001:6f8:900:9029:21b:21ff:fe6a:bf49/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::21b:21ff:fe6a:bf49/64 scope link
valid_lft forever preferred_lft forever
10: sixxs: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1400 qdisc fq_codel state UNKNOWN group default qlen 500
link/none
inet6 2001:6f8:900:1029::2/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::4f8:900:1029:2/64 scope link
valid_lft forever preferred_lft forever
My test machine is e.g. 2001:6f8:900:9029:a00:27ff:fe35:8362/64 (another Linux machine).
Now I run "curl -6 https://dot.kde.org". This gets stuck right after "connection established" on 2001:6f8:900:9029:a00:27ff:fe35:8362, but it downloads the whole index.html if run from 2001:6f8:900:1029::2 itself.
I've uploaded two pcap dumps from aforementioned br0 interface:
https://www.bunkus.org/misc/ipv6-dot.kde.org-trouble.7z
Any idea how to debug this further? Looking at Wireshark for the better part of two hours hasn't enlightened me yet :( I'll gladly provide any information you may deem necessary.
Note that I have a second tunnel with a second routed subnet (T137314 with R236074). I can request dot.kde.org from a machine within that second routed subnet, it's a similar setup overall.
I'd highly appreciate any insight. Thanks in advance.
Jeroen Massar on Monday, 14 December 2015 21:01:03
TCP connection stuck
That always hints heavily to Path MTU issues: some node dropping ICMP(v6) Packet Too Big packets.
why I cannot access https://dot.kde.org
A traceroute6 to 2a02:e980:1f::67 leads to nowhere from quite a few hosts. They likely have routing issues which might cause return packets to go missing.
10: sixxs: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1400
1400? What is that for a magic MTU setting?
You might want to start by verifying the correct MTU setting and configuring it properly. See the FAQ for details. Tracepath from your tunnel endpoint to the PoP is a good start here. After that, check with tracepath6 towards your IPv6 destination if all hops are playing nice and sending back ICMPv6 PTBs.
Shadow Hawkins on Monday, 14 December 2015 22:26:11
Hey Jeroen,
thanks for the reply.
That always hints heavily to Path MTU issues: some node dropping ICMP(v6) Packet Too Big packets.
I thought as much, but as requests from the other routed subnet on the other tunnel I'm using (both at the same PoP) are fine, I thought I'd ask here.
A traceroute6 to 2a02:e980:1f::67 leads to nowhere from quite a few hosts. They likely have routing issues which might cause return packets to go missing.
That's interesting, thanks.
1400? What is that for a magic MTU setting?
Mostly a result of some earlier troubleshooting from way back when.
I've given 1428 a try, however that got me into some real PMTU issues on all of my machines (including the tunnel endpoint). Even though tracepath6 reported 1428, requesting content from sites that worked before (www.heise.de) stopped working.
Makes sense: I'm using PPPoE. Therefore I've now set my MTU to 1420: ethernet - PPPoE (8) - IPv4 (20) - UDP (8) - AYIYA (44); followed by a restart of aiccu. "ip link show dev sixxs" shows MTU 1420 having been set. "tracepath deham01.sixxs.net" shows 1492, so yes, normal PPPoE in play without anything else.
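The subtraction in the post above can be written out as a small sketch. The per-layer sizes are the ones quoted in the post; the function name `tunnel_mtu` and the `OVERHEAD` table are purely illustrative and are not part of aiccu. It also shows where the earlier 1428 value plausibly comes from: the same calculation without the PPPoE header.

```python
# Per-layer encapsulation overhead in bytes, as quoted in the post.
OVERHEAD = {"pppoe": 8, "ipv4": 20, "udp": 8, "ayiya": 44}

# IPv6 requires every link to carry at least 1280-byte packets.
IPV6_MIN_MTU = 1280


def tunnel_mtu(link_mtu, layers):
    """Return the largest IPv6 packet that still fits after subtracting
    the overhead of every encapsulation layer from link_mtu."""
    mtu = link_mtu - sum(OVERHEAD[name] for name in layers)
    if mtu < IPV6_MIN_MTU:
        # Below 1280 the path cannot carry tunneled IPv6 at all.
        raise ValueError(f"{mtu} is below the IPv6 minimum of {IPV6_MIN_MTU}")
    return mtu


# Ethernet (1500) with PPPoE underneath the AYIYA tunnel:
print(tunnel_mtu(1500, ["pppoe", "ipv4", "udp", "ayiya"]))  # 1420
# The same tunnel on plain Ethernet, without PPPoE:
print(tunnel_mtu(1500, ["ipv4", "udp", "ayiya"]))  # 1428
```

Note that this only covers the first hop's encapsulation. The IPv4 path to the PoP can still have a smaller MTU somewhere along the way, which is why checking the actual IPv4 path matters.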
The sites that used to work before (google.com, heise.de) continue to work both from a routed machine as well as from the tunnel endpoint machine with 1420. tracepath6 from my tunnel endpoint machine confirms a maximum path MTU of 1420 (tried this both with my PoP as well as www.heise.de):
[0 root@sweet-chili ~] tracepath6 deham01.sixxs.net
1?: [LOCALHOST] 0.056ms pmtu 1420
1: gateway 24.671ms
1: gateway 24.309ms
2: 2001:6f8:862:1::c2e9:c729 23.951ms reached
Resume: pmtu 1420 hops 2 back 1
[0 root@sweet-chili ~] tracepath6 www.heise.de
1?: [LOCALHOST] 0.039ms pmtu 1420
1: gateway 24.164ms
1: gateway 23.922ms
2: 2001:6f8:862:1::c2e9:c729 31.297ms asymm 1
3: 2001:6f8:862:1::c2e9:c72c 42.938ms asymm 2
4: te0-0-2-3.c350.f.de.plusline.net 37.064ms asymm 8
5: 2a02:2e0:11:17:c::301 34.959ms asymm 9
6: te2-4.c102.f.de.plusline.net 83.387ms asymm 9
7: 2a02:2e0:3fe:0:c::1 85.710ms !A
Resume: pmtu 1420
My original problem is still present with 1420, though: "curl -6 https://dot.kde.org" works just fine from my tunnel endpoint machine but not from my routed machine.
Anything else I could try? Or is it more likely to be a problem on their end/with a machine in between? Thanks.
Kind regards,
mosu
Jeroen Massar on Tuesday, 15 December 2015 07:49:19
I've given 1428 a try
With MTU it is not about trying, it is about using the correct setting.
Even though tracepath6 reported 1428 requesting content
Tracepath6 is irrelevant when the IPv4 MTU, over which the IPv6 packets are being sent, is misconfigured.
Makes sense: I'm using PPPoE. Therefore I've now set my MTU to 1420: ethernet - PPPoE (8) - IPv4 (20) - UDP (8) - AYIYA (44);
There is no "default MTU for PPPoE", you actually have to look at the IPv4 path.
followed by a restart of aiccu.
As per the FAQ on the MTU subject, you also have to update the PoP using the webinterface to match the correct MTU value, otherwise the PoP will send too large packets.
There is a big reason why it defaults to 1280, as that should always work.
(and if 1280 is too large, then you cannot tunnel IPv6 packets).
Shadow Hawkins on Tuesday, 15 December 2015 19:04:52
Hey,
Tracepath6 is irrelevant when the IPv4 MTU, over which the IPv6 packets are being sent, is misconfigured.
I see. However, as shown above tracepath shows the IPv4 PMTU to my PoP is 1492 so eight bytes overhead for PPPoE should indeed be correct in my case. And therefore this calculation should ideally still be correct or not?
Makes sense: I'm using PPPoE. Therefore I've now set my MTU to 1420: ethernet - PPPoE (8) - IPv4 (20) - UDP (8) - AYIYA (44);
As per the FAQ on the MTU subject, you also have to update the PoP using the webinterface to match the correct MTU value, otherwise the PoP will send too large packets.
That's what I meant. When I say "I changed the MTU" I mean that I go to sixxs.net's tunnel information page, enter the new MTU there, hit the "Change" button, wait a couple of seconds and then restart aiccu. I always let aiccu set my interface's MTU; I never change it manually, in order to avoid discrepancies between my PoP's and my own configuration. Anyway:
There is a big reason why it defaults to 1280, as that should always work. (and if 1280 is too large, then you cannot tunnel IPv6 packets).
Yeah, I get that. I used to use 1280 in the past but switched to something higher in order to get more throughput. Now I'm taking your advice and reverting to 1280 and lo and behold, I can access dot.kde.org from my routed machine now.
So consider this topic closed, and thanks for the insight.
Jeroen Massar on Wednesday, 16 December 2015 11:09:03
And therefore this calculation should ideally still be correct or not?
It is very likely that 1492 is correct and that this is just an issue in the network towards those kde.org hosts that they simply do not handle !1500 MTU packets as they are dropping ICMPv6.
Shadow Hawkins on Tuesday, 15 December 2015 19:40:46
Hey,
*sigh*
Reports of success with MTU of 1280 were premature. Even with 1280 dot.kde.org does not work from a machine in the routed subnet. I got confused by my proxy settings when I said that it works now.
Still not solved, but hopefully the problem is not on my end.
Jeroen Massar on Wednesday, 16 December 2015 11:10:34
Moritz Bunkus wrote:
Hey,
*sigh*
Reports of success with MTU of 1280 were premature. Even with 1280 dot.kde.org does not work from a machine in the routed subnet. I got confused by my proxy settings when I said that it works now.
Still not solved, but hopefully the problem is not on my end.
If they have broken Path MTU discovery, which is likely as multiple hops are dropping ICMPv6 (tracepath6 shows that quite well), then any link that is not 1500 will be broken and it will just be magic if packets do flow properly.
Only way to solve this is to contact the people who run their network and get them to learn knowledge that has been known for well over 20 years: do not filter ICMPv6, it is essential.
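To make the failure mode concrete, here is a deliberately simplified sketch of Path MTU discovery. It is a toy model, not a description of any real stack: the function name, the list of hop MTUs and the `ptb_delivered` flag are all assumptions made for illustration. A sender starts with a packet size and shrinks it each time a hop reports Packet Too Big; if those ICMPv6 messages are filtered, the sender never learns the smaller size.

```python
def path_mtu_discovery(packet_size, hop_mtus, ptb_delivered=True, max_rounds=10):
    """Toy model of Path MTU discovery.

    hop_mtus      -- MTU of each link along the path, in order
    ptb_delivered -- False models a filter dropping ICMPv6 Packet Too Big
    Returns the packet size that finally gets through, or None if the
    sender is stuck in a black hole.
    """
    size = packet_size
    for _ in range(max_rounds):
        # First link along the path that cannot carry the packet, if any.
        bottleneck = next((mtu for mtu in hop_mtus if mtu < size), None)
        if bottleneck is None:
            return size  # packet fits every link and is delivered
        if not ptb_delivered:
            return None  # packet silently dropped, sender never finds out
        size = bottleneck  # sender lowers its size to the reported MTU
    return None


path = [1500, 1492, 1420]  # hypothetical path ending in a 1420-byte tunnel

print(path_mtu_discovery(1500, path))                       # 1420
print(path_mtu_discovery(1500, path, ptb_delivered=False))  # None
print(path_mtu_discovery(1200, path, ptb_delivered=False))  # 1200
```

The last call shows why small packets can still get through on such a path: only packets larger than the smallest link depend on Packet Too Big messages arriving.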
Shadow Hawkins on Wednesday, 16 December 2015 14:00:27
Hey,
alright. I'll see if I can get hold of someone there.
Thanks for your help.
mo