Slow VPN - Please help! - Cabling-Design.com Forums

- bob.szurgot
  
posted
18 years ago

Fri, Jan 27, 2006 6:25 PM

Hello all. I've been scanning every conceivable source looking for an answer to my problems. Quick description of our infrastructure: We have a central office/HQ hosting our (all W2K3) domain controller, print server, file server, terminal server and intranet server, as well as our applications/DB servers. Subnet of HQ (behind NAT) is 172.19.0.0/23. Using a Nortel 1100 "VPN-in-a-box" as our VPN gateway/stateful firewall/router. Internet connection to HQ is Full T1. We have 4 branch offices, each with a Netgear FVS338 VPN router/firewall. Remote subnets (behind NAT) are 172.19.1.0/24, 172.19.2.0/24, 172.19.3.0/24 and 172.19.4.0/24. Connectivity to each location is ADSL 3.0Mbps down/384Kbps (or better) up.

I have succeeded in bringing up my tunnels to each location. ICMP pinging within the tunnel shows anywhere from 75ms to 110ms latency on average, internal subnet to subnet, e.g. 172.19.0.1 to 172.19.3.128. Users at the branch office can authenticate back to the HQ, open Terminal Server sessions, use mapped drives, print to networked printers, etc. The problem is, performance is monstrously poor. I've observed the following, intermittently:

- Tunnel drops anywhere from once every 2 hours to once every 2 days.
- A user's mapped drive or authentication will disappear every 15 to 20 minutes.
- RDP Terminal Server sessions will "lock up", time out and die every 5 to 30 minutes.
- Print jobs will fail/time out.
- Slow, slow, slow, SLOW response at all branch offices.
- However, individual users on home DSL connections with client software seem to be fine.
- Measured packet loss LAN to LAN, WAN to WAN. Both sit at roughly 2%-3% (seems high).
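The addressing quoted in the infrastructure description can be sanity-checked for overlapping subnets. The sketch below uses only the prefixes given in the post and Python's standard `ipaddress` module; the helper name `overlapping_branches` is made up for illustration. As posted, the HQ prefix 172.19.0.0/23 spans 172.19.0.0 through 172.19.1.255, which includes the first branch subnet 172.19.1.0/24. Whether that overlap plays any part in the symptoms is not established in this thread, and it may simply be a typo in the post.

```python
import ipaddress

# Prefixes exactly as quoted in the post.
HQ = ipaddress.ip_network("172.19.0.0/23")
BRANCHES = [ipaddress.ip_network(f"172.19.{n}.0/24") for n in range(1, 5)]


def overlapping_branches(hq, branches):
    """Return the branch networks whose address range overlaps the HQ network."""
    return [net for net in branches if hq.overlaps(net)]


if __name__ == "__main__":
    print(f"HQ {HQ} covers {HQ[0]} to {HQ[-1]}")
    for net in overlapping_branches(HQ, BRANCHES):
        print(f"Branch {net} overlaps the HQ range")
```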

After much research, I've tried the following:

- Set all MTU values to a lower level. Tried 1460, 1427 and 1400. Seems to have little effect.
- Reduced the encryption from ESP 3DES/MD5 128bit with PFS, down to single DES/MD5 no PFS. Seems to have little effect.
- Turned NAT traversal on/off, with (you guessed it) little effect.
- Tried various combinations of "nailed up", keepalive and rekey timeout values.
- Swapped a couple of the Netgear boxes for Nortel Contivity 200 boxes, no help.
- Turned on every conceivable setting for "Netbios over TCP/IP". No good.
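For the MTU values tried above, a rough overhead calculation shows how large an inner packet can be before the encrypted packet exceeds the link MTU. This is a minimal sketch under stated assumptions, not a description of what the Nortel or Netgear boxes actually do: IPv4 ESP in tunnel mode, a CBC cipher with an 8-byte block and 8-byte IV (as with DES or 3DES), a 12-byte truncated HMAC-MD5 integrity value, and an optional 8-byte UDP header when NAT traversal is on. The 1492-byte link MTU is also an assumption (typical for PPPoE-based ADSL); the post does not say how the DSL lines are provisioned. The function name `max_inner_packet` is hypothetical.

```python
def max_inner_packet(link_mtu, nat_t=False, block=8, iv=8, icv=12):
    """Largest inner IP packet that fits in one ESP tunnel-mode packet.

    Assumes IPv4 without options, an 8-byte ESP header (SPI + sequence),
    and a 2-byte ESP trailer (pad length + next header).
    """
    outer_ip = 20                 # new outer IPv4 header
    udp = 8 if nat_t else 0       # NAT-T wraps ESP in UDP
    esp_header = 8
    # Bytes left for the encrypted part: inner packet + padding + trailer.
    room = link_mtu - outer_ip - udp - esp_header - iv - icv
    room -= room % block          # encrypted part must be a whole number of blocks
    return room - 2               # subtract the 2-byte ESP trailer


if __name__ == "__main__":
    for link_mtu in (1500, 1492):
        for nat_t in (False, True):
            limit = max_inner_packet(link_mtu, nat_t)
            print(f"link MTU {link_mtu}, NAT-T {nat_t}: inner packet <= {limit}")
            for tried in (1460, 1427, 1400):   # values from the post
                verdict = "fits" if tried <= limit else "would need fragmentation"
                print(f"  {tried}: {verdict}")
```

Under these assumptions, 1460 is too large in every case, while 1427 and 1400 both fit, so the lower two values should already have ruled out simple ESP overhead as the cause.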

I've been told once or twice now that perhaps our DSL connections, with the 2% to 3% packet loss, are the real culprits. If that's the case, will "beefing up" the remote locations to dedicated T1 make a difference? Is 2% packet loss a reasonable explanation for huge connection issues? And how about MTU: Other than setting the VPN routers and servers to 1400, do I need to do the printers, the switches, workstations, etc.?
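One way to put a number on the packet-loss question is the well-known Mathis approximation for steady-state TCP throughput, roughly (MSS / RTT) x (C / sqrt(p)) with C about sqrt(3/2). The sketch below applies it to the loss rates and latency mentioned in the post. The 1360-byte MSS (a 1400-byte MTU minus 40 bytes of IP and TCP headers) and the 100 ms round-trip time are assumptions chosen to match the figures quoted, and the result is only a rough ceiling for a single TCP stream. It ignores link capacity, timeouts and tunnel drops.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss):
    """Approximate upper bound on one TCP stream's throughput, in bits/s."""
    c = math.sqrt(1.5)  # constant from the Mathis et al. model
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))


if __name__ == "__main__":
    mss = 1360      # assumed: 1400-byte MTU minus 40 bytes of headers
    rtt = 0.100     # assumed: 100 ms, within the 75-110 ms range in the post
    for loss in (0.02, 0.03, 0.004):
        kbps = mathis_throughput_bps(mss, rtt, loss) / 1000
        print(f"loss {loss:.1%}: about {kbps:.0f} kbit/s per TCP stream")
```

The bound only scales with 1/sqrt(p), so cutting loss from 2% to 0.4% raises it by a factor of about 2.2. That suggests loss at these levels slows transfers but would not, by itself, be expected to explain sessions that lock up completely.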

It just seems like I'm missing something fundamental, that will solve all of my problems.

Any attention that you could bring to bear would be greatly appreciated.

Thank you for your time.

Bob Szurgot bobDOTszurgotATindianahardwoodsDOTnet

- Simon
  
posted
18 years ago

Sun, Jan 29, 2006 9:00 AM

The packet loss is pretty high and may well be the culprit. I would also have looked at the MTU size: setting the MTU on the routers should cause them to send ICMP messages back to the clients, allowing them to adjust their packet size, although that doesn't always happen. A quick test is to run the client VPN to the central site from the remote office, over the tunnel, and see how things work. The client VPN will use a smaller MTU by default.
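Since the ICMP "fragmentation needed" messages mentioned above do not always arrive, the usable packet size through the tunnel can also be probed directly with don't-fragment pings. The sketch below is one possible way to do that, not something proposed in the thread: `largest_passing_payload` is a hypothetical helper that binary-searches the ICMP payload size, and `ping_df` assumes a Linux host with iputils `ping` (`-M do` sets the don't-fragment behaviour; on Windows the equivalent command is `ping -f -l <size>`). With 2% to 3% loss a single lost probe can be mistaken for "too large", so each size should really be retried a few times. The sketch leaves that out.

```python
import subprocess


def ping_df(host, payload):
    """True if one don't-fragment ping with this payload size gets a reply.

    Assumes Linux iputils ping, which exits with 0 only when a reply arrives.
    """
    cmd = ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def largest_passing_payload(probe, low=0, high=1472):
    """Binary-search the largest payload size for which probe(size) is True.

    Returns None if even the smallest size fails. The matching IPv4 packet
    size is payload + 28 (20-byte IP header + 8-byte ICMP header).
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid       # mid gets through, so look higher
        else:
            high = mid - 1  # mid is too big, so look lower
    return low


# Example use against the HQ address from the post (not run here):
#   size = largest_passing_payload(lambda n: ping_df("172.19.0.1", n))
#   print("largest unfragmented packet:", None if size is None else size + 28)
```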

- bob.szurgot
  
posted
18 years ago

Wed, Feb 1, 2006 3:22 PM

Using the VPN client from one of the remote sites, through the tunnel, yields the same sort of performance, so MTU does not seem to be the culprit. I've also done a more extensive WAN-to-WAN test to check for packet loss. I put stand-alone laptops on each end of my WAN, and packet loss is very, very low: under 0.4%. So now I'm no longer suspecting the DSL service itself, and I'm back to believing that some setting or combination of equipment is causing these statistics.

Has anyone else seen this sort of behavior? It is most noticeable in the middle of Terminal Server (RDP Client) sessions. The user's screen will lock up, or go black, and then they have no choice but to close the client, start a new local client session, and reconnect to their already-running server-side session. As you can imagine, this is getting extremely frustrating for my users.

Any help at all would be greatly appreciated.

Thanks.

Bob bob dot szurgot at indianahardwoods dot net

- Simon
  
posted
18 years ago

Wed, Feb 1, 2006 6:21 PM

Hi Bob, all I can think of now is to test RDP etc. with the PC on the good 0.4% connection if you can. If that's OK, then put the kit you took out back in bit by bit. It could be something off the wall like a network card or a switch causing the problem; even devices that auto-detect duplex modes and get it wrong can cause these issues.

Simon