Windows Kills SMB Speeds When Using Tailscale

104 points posted a day ago
by salmon

42 Comments

luma

a day ago

This is framed as a problem with Windows, when it’s clearly a problem with Tailscale misreporting its capabilities to the OS. If I have a 100 Gbit and a 1 Gbit interface, it’s perfectly reasonable for the OS to auto-assign route metrics to prefer the much faster interface.

This is the OS working as designed, switching to Linux won’t help. Tailscale needs to do a better job reporting link characteristics.

windexh8er

a day ago

This actually would have nothing to do with the problem (advertised link speed) if Tailscale had more fine grained control over how SR routes are distributed. Currently it's all or nothing and there's no way to ignore specific routes if you're local to that network. Route length trumps network metrics, so it would be better handled in that manner.

In almost every OS I've seen interface metrics will only be used for equal cost route lengths.
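
As a rough illustration of the ordering described above (prefix length first, interface metric only as a tie-breaker), here is a minimal Python sketch. The `Route` type, the addresses, and the metric values are all made up for the example; it is not how any particular OS implements its lookup.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network


@dataclass(frozen=True)
class Route:
    prefix: IPv4Network
    interface: str
    metric: int  # lower is preferred


def select_route(routes, destination):
    """Pick the longest matching prefix; use the metric only to break ties."""
    dest = IPv4Address(destination)
    candidates = [r for r in routes if dest in r.prefix]
    if not candidates:
        return None
    # Sort key: most specific prefix first, then lowest metric.
    return min(candidates, key=lambda r: (-r.prefix.prefixlen, r.metric))


# Hypothetical table: the same /24 is reachable over both interfaces.
equal_length = [
    Route(IPv4Network("192.168.1.0/24"), "ethernet", 25),
    Route(IPv4Network("192.168.1.0/24"), "tailscale", 5),
]
# Same table, but the overlay route is less specific (/23).
shorter_overlay = [
    Route(IPv4Network("192.168.1.0/24"), "ethernet", 25),
    Route(IPv4Network("192.168.0.0/23"), "tailscale", 5),
]

print(select_route(equal_length, "192.168.1.50").interface)     # tailscale
print(select_route(shorter_overlay, "192.168.1.50").interface)  # ethernet
```

With equal prefix lengths the lower metric wins, which is the situation the article ran into; once the prefixes differ, the metric never gets consulted.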

cosmotic

18 hours ago

Using actual throughput would be a nice improvement over using link speed. In this case, just detecting the current throughput is a tiny fraction of the link speed and falling back to another link would be a huge improvement.

windexh8er

a day ago

This isn't a Windows problem. The OP would experience the same problem on Linux. I've run into this with SRs. I believe I may have even opened an issue with Tailscale to detect when a client is local to an exit and/or provide more fine grained route ingestion depending on where the client is with respect to the SR.

But... Again, not a Windows problem. It is easy to fix by just advertising a less specific (shorter-prefix) route. But that assumes you won't clobber other things. By default the more specific route is chosen, so a less specific route advertised on the TS interface won't be selected.

muststopmyths

a day ago

So, a virtual adapter advertises 100Gbps link speed, but is not capable of delivering that and the takeaway is "Windows kills..." ?

How do other OSes handle the situation of having two interfaces with identical routes to a given destination ?

I don't see a better solution than using link speed, but I haven't thought about it too deeply.

windexh8er

a day ago

Tailscale shouldn't advertise a route that is local to the machine. This is a routing loop. The way SR route distribution works in Tailscale is that you accept all routes or nothing. Routing platforms have the concept of route filters to prevent accepting an advertised route that would create a loop.

There are hacky ways around this without having to deal with metrics (just advertise a /23 instead of a /24 and the /24 will be selected by default). But if you've got contiguous subnets you may not be able to clobber the additional address space just to avoid the route.
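
To make the /23 workaround and its cost concrete, here is a small Python sketch using the standard `ipaddress` module. The `192.168.1.0/24` LAN is a hypothetical example, not taken from the article.

```python
from ipaddress import IPv4Network

# Hypothetical local LAN that the subnet router also serves.
local_lan = IPv4Network("192.168.1.0/24")

# What the subnet router would advertise instead: the enclosing /23.
advertised = local_lan.supernet(new_prefix=23)

# Address space that gets pulled into the overlay route as a side effect.
extra = [n for n in advertised.subnets(new_prefix=24) if n != local_lan]

print(f"local LAN:        {local_lan}")
print(f"advertised route: {advertised}")
print(f"also covered:     {', '.join(str(n) for n in extra)}")
```

The local /24 stays more specific than the advertised /23, so on-LAN clients keep using the directly connected route. The "also covered" line is the catch: if that neighbouring /24 is in use somewhere else, traffic for it would now be drawn to the subnet router.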

RockRobotRock

a day ago

I really thought Tailscale would automagically figure this out. If this were true in all cases, my internet would not work at all since it would try to reach my router through the Tailscale interface.

It's odd.

windexh8er

a day ago

The SR can't route for your gateway or else Tailscale itself would break its own connection. A gateway IP isn't treated the same as a subnet route.

You don’t have any specific routes to random internet addresses though. And Tailscale would not either. Unless your Windows server is running BGP, all your Internet traffic is hitting the default route.

insaneirish

a day ago

I feel like this whole thing buries the lede a bit.

Yes, turns out running overlay/VPN type things disrupts traffic patterns. This is a non-story.

But we're talking about using wireguard on a local network, so the actual interesting question is: why does it cause the performance to plummet? Is it an implementation issue or something more fundamental?

I expect some performance impact. I don't expect a three orders of magnitude impact (which is what 355 KB/s implies).

thowawatp302

a day ago

It’s TCP, so bandwidth-delay product, if the hairpin that gets the traffic back to the local lan does anything non-trivial.
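
For readers unfamiliar with the bandwidth-delay argument: a single TCP connection can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. A minimal Python sketch follows; the 64 KiB window and both RTT values are assumptions picked for illustration, not measurements from the article.

```python
def max_tcp_throughput(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound, in bytes per second, for one TCP connection.

    At most one window of data can be in flight per round trip.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return window_bytes * 1000 / rtt_ms


WINDOW = 64 * 1024  # hypothetical 64 KiB window

# Hypothetical round-trip times for the two paths.
for label, rtt_ms in [("direct LAN", 0.5), ("hairpin via overlay", 50)]:
    rate = max_tcp_throughput(WINDOW, rtt_ms)
    print(f"{label}: RTT {rtt_ms} ms -> at most {rate / 1e6:.1f} MB/s")
```

Under these made-up numbers a 100x increase in RTT cuts the ceiling by the same factor, regardless of how fast the link itself is. Whether the hairpin in the article actually adds that much latency is not established in this thread.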

I check the "Allow local network access" in Exit Nodes, then it transfers at max speed over local Ethernet.

windexh8er

21 hours ago

This doesn't actually affect anything if you're accepting Tailscale SRs. The conflict the article states is accepting a route advertised by Tailscale for their local network (the SR route) while on the local network (same network as the SR route). This forces all traffic through the wireguard interface, then it's routed to the SR and then back out because the interface metric is better than the hardware because of the link speed advertised. This is the root of the bandwidth issue.

The "Allow local network access" is an IP filter that's put into place or not.

bGl2YW5j

a day ago

Thanks to the author for this!

What oddly coincidental timing ... I finished setup of Tailscale just yesterday and ran into this exact issue when testing it. I didn't think too much of it and blamed the USB connection I'm using to connect my external drive.

accrual

a day ago

It was nice to see PowerShell could change the interface metric when the adapter GUI refused due to the empty IP field. I bet that check has been there since the 90s.

It makes me a little happy when a new CLI is able to do something the old GUI cannot!

ygra

a day ago

It's not just the new CLI. I guess you could have done the same with netsh for ages as well.

accrual

7 hours ago

Nice, you're right, looks like `netsh` can do it:

    netsh interface ipv4 set interface "<if>" metric=<metric>

magicalhippo

a day ago

I have my desktop PC connected to my TrueNAS box via both regular 1GbE via switch and a direct 10GbE link. I experienced similar issues where sometimes Windows would pick the sub-optimal interface.

I decided to brute force it, by editing my hosts file on Windows and adding a custom entry for the static IP assigned to the 10GbE adapter in TrueNAS. So if my NAS was named "mynas" I'd add a "mynas10" entry in hosts file.

caconym_

a day ago

If Tailscale is being used for remote access to the author's LAN, why is it running on a desktop that's always physically connected to the LAN? I have a similar setup for remote access but using Wireguard instead; my main router (pfSense VM running on Proxmox like the author's thing) handles the tunnels and routing for the remote subnet(s), and it all Just Works. Only the devices that actually get used remotely need to be set up as Wireguard peers, and they're configured to disconnect from the tunnel when they're on my home wifi. IIUC Wireguard automatically does the setup/teardown of routes on those peers when it's toggled on/off.

RockRobotRock

a day ago

>If Tailscale is being used for remote access to the author's LAN, why is it running on a desktop that's always physically connected to the LAN?

Because it's probably not only used for that. Personally, I want to access my local network segment from anywhere, and at the same time SSH into a cloud box without exposing port 22 to the internet.

Tailscale does the second one really well. I've also had problems with route loops which is why I've avoided the subnet router feature.

caconym_

a day ago

> Because it's probably not only used for that. Personally, I want to access my local network segment from anywhere, and at the same time SSH into a cloud box without exposing port 22 to the internet.

In my Wireguard-based setup there is no difference between the former and the latter. Remote peers connect to my router via a single open Wireguard port and then routing goes both ways—remote to LAN, LAN to remote, and also remote to remote via my router. Machines on the LAN have routes to any other LAN or remote machine without needing multiple interfaces or any local VPN configuration.

For some people Tailscale's features will be game changers (NAT hole punching, automatic DNS for all tailnet clients across multiple subnets, etc.) but I'm afraid OP may be using Tailscale as a crutch rather than getting his router sorted out properly, and the result is this weird redundancy of core network functions covering the same set of machines.

It's not even really a Tailscale problem per se, though I guess if you have machines naively connected to a Tailscale "subnet router" analogous to how my network is set up, you may not be able to take advantage of the full Tailscale feature set.

jeroenhd

a day ago

> If Tailscale is being used for remote access to the author's LAN, why is it running on a desktop that's always physically connected to the LAN?

Tailscale has a few nice additional features as well, like automatic DNS assignment for hosts on the virtual network, generation of HTTPS certificates for those hosts, and, if you enable the right middleware in your locally run services, transparent authentication to web servers for computers on the network. If you're going all-in on Tailscale, you can use it to automate a lot of network management. That would require you to run Tailscale on all of your devices, though.

stego-tech

a day ago

Because, for whatever reason I’ve yet to grasp, homelab folks like to implement Tailscale as some sort of “secure virtual network” abstraction layer - think something similar to zScaler ZPA - on top of their local LAN. To be fair, I didn’t think Tailscale did a good job explaining why this isn’t a great idea last time I tinkered with it in 2022.

If you can juggle SSH keys and forward ports on your firewall, you can just run plain old Wireguard. Don’t use Tailscale as a network abstractor unless you know what you’re using it that way for, and why.

jauer

a day ago

> Because, for whatever reason I’ve yet to grasp, homelab folks like to implement Tailscale as some sort of “secure virtual network” abstraction layer - think something similar to zScaler ZPA - on top of their local LAN.

This is Tailscale's intended behavior, not a matter of how homelab folks like to implement it: https://github.com/tailscale/tailscale/issues/659#issuecomme...

RockRobotRock

a day ago

Maybe I'm not understanding properly, but why can't my device ARP ping and handshake with the subnet router to determine that I'm on the local subnet and to stop routing it through Tailscale?

jauer

a day ago

Tailscale intentionally overrides your device's routing table to force traffic between hosts in the same subnet to go over a Wireguard tunnel instead of bypassing it. They do this because they believe that the presumption that a local subnet is trustworthy is false.

lmm

a day ago

It could, but the Tailscale devs don't consider "silently start leaking traffic to anyone on the local subnet" to be a desirable feature.

stego-tech

16 hours ago

This is why I (thought I) prefaced my gripe with the context of date and documentation. Looking at modern docs, yeah, it absolutely looks like it’s trying to be a Freemium alternative to something like zScaler but on top of Wireguard (virtual secure network), but the OP’s article still makes me bristle because it demonstrates the lack of knowledge of the implications of that deployment model.

Case in point is that their grievance is about SMB to their NAS being routed over Tailscale despite being on the same network as the SMB endpoints. Ideally this is something that should’ve come up during the architecture phase of deployment: how should traffic be handled when both machines share the same network? When should Tailscale’s routing table prefer the local adapter over the Tailscale adapter? If Tailscale cannot be configured to advertise a specific link speed that accurately reflects network conditions, how can we apply policies to the endpoints to route traffic correctly?

I admittedly used this article as a personal soapbox to yell at (software) folks to get out of my lane (IT), and that was a fault of mine; I should’ve taken more time to articulate the pitfalls of these sorts of rapid deployments homelabs can facilitate, and share my expertise from my field with others instead of grandstanding. That’s on me.

I needed access to my home NAS and linux GPU box while visiting family last year over the holidays. I was in a rush. I spent 45 minutes trying to get Wireguard configured and working, then tried Tailscale and had the network I was looking for in 15 minutes. I'm not a homelabber. I hate network admin.

Is Just Works™ / being moron-resistant, with good first-party client apps, a bad reason to pick Tailscale?

stego-tech

17 hours ago

Of course not - if it works, it works, and I won’t fault folks for using Tailscale (heck, I like Tailscale, but I just got Wireguard working suitably for my needs first). My gripe was more that folks use it for a virtual network on their home LAN without seemingly grasping the implications of such abstraction - kind of like how the trend during the pandemic was “everything in Kubernetes” even though VMs might have been a better fit for their given problem.

If you’re willing to put in the effort to make it work, then go for it, but I just caution folks to understand there might be better solutions to consider - and that especially when talking about abstraction layers, you absolutely need to understand the implications of said layers before deployment.

Animats

a day ago

Ah, non-transparent middlebox trouble.

wtcactus

a day ago

Does this also happen in Zerotier?

Don't get me wrong, I think Tailscale is absolutely great, I'm just interested in trying Zerotier for a while since it has integration with OPNsense (in the GUI; I know Tailscale works fine if you install the package and configure it manually).

hk1337

a day ago

I don’t think this is exclusive to Windows. SMB is a crappy service for anything outside local LAN. I am not too familiar with Tailscale but from what I understand, it’s basically akin to a VPN.

Article spoiler: the issue is Tailscale. SMB and Windows are red herrings.

True. I have exactly the same speed issues with SCP as with SMB. It also depends on the exit node used - some exit nodes give 10MB/s, some give 1MB/s. Doesn't work without exit nodes at all - cross-border blocking issues.

dawnerd

a day ago

Sounds like it’s a different issue here, but I can confirm using WireGuard absolutely destroys SMB performance. It’s not as bad on Windows, but on Mac it’s basically unusable.

karlgkk

a day ago

“I don’t know what I’m talking about but here’s my opinion.”

Thanks for your contribution

mixdup

a day ago

Did you read the linked post? It actually has nothing to do with SMB

leshokunin

a day ago

I’ve been curious as to why SMB seems to get little attention, and NFS even less. I had to jump through hoops to even get NFS working at all on Windows.

I treated myself to 10GbE a while ago, and it feels like the protocol side of this is something that just gets overlooked. Unclear why. Maybe people just assume once it works, it works?