A terrible way to jump into colocating your own stuff

60 points | posted 4 hours ago by ingve | 26 comments

yjftsjthsd-h

3 hours ago

> 1. Install Linux on the box. Turn everything off but sshd. Turn off password access to sshd.

Also, test that it's properly disabled with something like `ssh -v yourserver : 2>&1 | grep continue`, because there are a surprising number of ways for that to go wrong (did you know that sshd lets you Include multiple config files together in a way that can override the main one? I know that now.)
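
With password auth properly off, that grep should only ever show publickey (and perhaps other non-password methods). A sketch of what to expect, with `yourserver` as a placeholder:

    # Ask the server which auth methods it will still accept:
    ssh -v yourserver : 2>&1 | grep continue
    # debug1: Authentications that can continue: publickey            <- good
    # debug1: Authentications that can continue: publickey,password   <- not good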

a-french-anon

11 minutes ago

This. OVH's VPS had two .confs reenabling passwords. Now I know too.
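
The mechanics, in case it helps anyone else: on a lot of distros /etc/ssh/sshd_config now starts with `Include /etc/ssh/sshd_config.d/*.conf`, and sshd takes the first value it sees for each keyword, so a stray drop-in wins over the main file. A quick way to hunt for them (paths are the Debian/Ubuntu defaults, adjust for your distro):

    grep -ri passwordauthentication /etc/ssh/sshd_config /etc/ssh/sshd_config.d/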

Maledictus

2 hours ago

and `sshd -T | grep -i password`
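
That's arguably the more direct check, since `sshd -T` dumps the effective config after all the Includes are resolved. Roughly what you want to see (run as root; keywords come out lowercased, and the second one may be ChallengeResponseAuthentication on older OpenSSH):

    sudo sshd -T | grep -iE 'passwordauthentication|kbdinteractive'
    # passwordauthentication no
    # kbdinteractiveauthentication no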

mjevans

2 hours ago

They're absolutely correct:

" 1. Install Linux on the box. Turn everything off but sshd. Turn off password access to sshd. If you just locked yourself out of sshd because you didn't install ssh keys first, STOP HERE. You are not ready for this. "

If you blindly followed the directions and got locked out, you'd do exactly the same thing following any other set of directions. You were not ready.
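
For anyone doing this for the first time, the ordering that avoids the lockout looks roughly like this (username and hostname are placeholders, and on Debian/Ubuntu the service may be called `ssh` rather than `sshd`):

    ssh-copy-id you@yourserver          # get the key onto the box first
    ssh you@yourserver                  # confirm key login works; keep this session open
    # only now set "PasswordAuthentication no" in /etc/ssh/sshd_config, then:
    sudo systemctl restart sshd
    # from a second terminal, this should now be refused:
    ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no you@yourserver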

nasretdinov

an hour ago

At least it doesn't say to set PermitRootLogin and remove the root password :)
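
For reference, the boring stanza you probably want instead (a sketch, not whatever the article's author actually runs):

    PermitRootLogin no            # or prohibit-password if you really need root over ssh
    PasswordAuthentication no
    PubkeyAuthentication yes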

kijin

an hour ago

The great thing about having unfettered physical access to hardware is that you can easily recover from mistakes like this. No need to rebuild that EC2 instance. No need to beg a hosting company for IPMI access. You can just pull the plug and try again as if it were your own PC.

krab

2 hours ago

A bit less terrible way in my opinion:

Find a dedicated server provider and rent the hardware. These companies rent out part of a datacenter (or sometimes build their own). Bonus points if they offer KVM - as in a remote console, not the Linux hypervisor. Also ask whether they do hardware monitoring and proactively replace failed parts. All of this is still way cheaper than cloud, usually with unmetered networking.

Way less hassle. They'll even take your existing stuff and put it into the same rack with the rented hardware.

The difference from cloud, apart from the price, is mainly that they have a sales rep instead of an API. And getting a server may take from a few hours to a few days. But in the end you get the same SSH login details you would get from a cloud provider.

Or, if you really want to just colocate your boxes, the providers offer a "remote hands" service, so you can have geo-redundancy or just choose a better deal instead of one that's physically close to your place.

rsanheim

2 hours ago

One significant hurdle for companies that have only ever known cloud hosting: how do you find a reliable, trustworthy datacenter? One that actually monitors the hardware and has a real human available if your network access gets screwed up or a critical component needs swapping at 2 am on a Saturday.

I used to have a short list of trustworthy companies like this I'd recommend to clients ~20 years ago when doing consulting. I think 3/4 of them have been gobbled up by private equity chop shops or are just gone.

Nowadays no one gets fired for going with AWS, or with AWS resold at a 100% markup by a 'private enterprise cloud' provider.

krab

an hour ago

You're right that you need to find a company you can trust.

And for a lot of startups it really makes sense to use AWS. But if you do something resource- or bandwidth-intensive (and I'm not even talking about Llama now), the costs add up quickly. In our case, switching to AWS would increase our costs by the equivalent of 4-8 devs' salaries. After AWS discounts. That's a hard sell in a 15-person team, even though half of our infra costs are already with AWS (S3).

vidarh

an hour ago

I often recommend an "embrace, extend, extinguish" approach to AWS: starting there for simplicity is fine, then "wrap" anything bandwidth-intensive with caches elsewhere (every 1TB of egress from AWS will pay for a fleet of Hetzner instances with 5TB included, or one or more dedicated servers).

Gradually shift workloads, leaving anything requiring super-high durability for last (optionally keeping S3, or its competitors, as a backup storage option), since getting durability right is one of the more difficult things to get confidence in and one of the most dangerous to get wrong.

Wrapping S3 with a write-through cache setup can often be the biggest cost win if your egress costs are high. Sometimes caching the entire dataset is worth it, sometimes just a small portion.
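
The read side of that is often just a caching proxy parked in front of the bucket somewhere with cheap egress. A minimal nginx sketch, with a made-up bucket, domain and cache sizes (writes would still go straight to S3, so strictly this is read-through rather than write-through):

    proxy_cache_path /var/cache/nginx/s3 levels=1:2 keys_zone=s3cache:100m
                     max_size=500g inactive=30d use_temp_path=off;

    server {
        listen 80;
        server_name assets.example.com;

        location / {
            proxy_cache           s3cache;
            proxy_cache_valid     200 30d;
            proxy_cache_use_stale error timeout updating;
            proxy_set_header      Host my-bucket.s3.amazonaws.com;
            proxy_pass            https://my-bucket.s3.amazonaws.com;
        }
    }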

krab

an hour ago

Well, S3 is hard to beat for our use case. We make heavy use of their various tiers, and we store a somewhat large amount of data but only a minor part of it ever goes out.

The compute- and network-heavy stuff we do is still outside AWS.

vidarh

an hour ago

That's pretty much the one situation where they're competitive, so it sounds very reasonable. Some of their competitors (Cloudflare, Backblaze) might be worth comparing, but the biggest cost issue with S3 by far is egress, so if not much goes out it might still be best for you to stay there.

Sounds like (unlike most people who use AWS) you've done your homework. It's great to see. I've used AWS a lot, and will again, because it's often convenient, but so often I see people doing it uncritically, without modeling their costs even as they skyrocket with scale.

vidarh

an hour ago

This. I used to colo lots of stuff, but now mostly use Hetzner. But there are many in this space, and some of them even offer an API. And some of them (like Hetzner) also offer at least basic cloud services, so you can mix and match (which allows for even steeper cost cuts - instead of loading your dedicated hardware to 60% or whatever you're comfortable with to have headroom, you can load it higher and scale into their cloud offering to handle spikes).

The boundary where colo and dedicated server offerings intersect in price tends to come down to land and power costs - Hetzner finally became the cheaper option for me as London land values skyrocketed relative to their locations in Germany, and colo prices with them. (We could have looked at colo somewhere remote, but the savings would've been too small to be worth it.)

seszett

an hour ago

The most difficult step is only barely mentioned, I find: finding colocation space at a reasonable price is difficult these days.

moandcompany

3 hours ago

Documenting and testing the hard reset/reboot procedure for your colocated gear, along with what to expect from it, sounds like a good thing to add to this list.
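
Even something as crude as this, run before the machine ever leaves your desk, catches a lot (assumes systemd; names are placeholders):

    sudo systemctl reboot           # does it come back on its own?
    ping -c 300 yourserver          # watch it drop and come back
    ssh you@yourserver uptime       # sshd is up again without anyone touching it
    # then repeat with a hard power cut, so you know fsck / RAID resync /
    # "press F1 to continue" won't leave it waiting for a keyboard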

bigfatkitten

2 hours ago

Test to make sure it'll actually come back after a hard reset.

Don't do what I did and put a Cisco router on a remote mountaintop radio site with the config register still set to 0x2142, and then go home.
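
For anyone who hasn't been bitten by this: 0x2142 makes the router ignore its startup-config on boot, which is great for password recovery and terrible for an unattended site. The fix, typed from memory, so treat it as a sketch:

    Router# show version | include register
    Configuration register is 0x2142
    Router# configure terminal
    Router(config)# config-register 0x2102   ! boot normally from startup-config again
    Router(config)# end
    Router# write memory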

pferde

an hour ago

Hey, that's just an excuse to go for another hike! :)

ggm

2 hours ago

Remote power management can be a godsend. If you can get an IPMI console in, you want it.
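
Even the basics cover most emergencies. ipmitool examples, with a placeholder BMC address and credentials:

    ipmitool -I lanplus -H 10.0.0.2 -U admin -P 'secret' chassis power status
    ipmitool -I lanplus -H 10.0.0.2 -U admin -P 'secret' chassis power cycle
    ipmitool -I lanplus -H 10.0.0.2 -U admin -P 'secret' sol activate   # serial-over-LAN console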

qhwudbebd

40 minutes ago

IPMI is a bit of a double-edged sword. Network-connected access to a serial console (including UEFI/BIOS console redirection) and the reset button can be a total lifesaver, I agree. I wouldn't want to be without a serial console and a remote reset button either.

But IPMI cards are little network-attached Linux boxes which run prehistoric kernels and userspace, expose as many services as the half-wits who put together the firmware image can shovel in, and are rarely if ever patched by the vendor unless there's some really public scandal.

The standard thing to do is to isolate them on some kind of private management network in an attempt to shield the wider internet from the full majesty of the firmware engineers' dazzling skills, but that might be harder to do in the simple 'beginner' scenario Rachel describes.

One good, simple version once you get up to two servers instead of just one is to cross-connect them, so each machine has IPMI access to the other but neither IPMI service is exposed to the wider world.
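
Roughly (addresses and the LAN channel number are made up, and channel numbering varies between boards): run a cable from a spare NIC on box A into box B's BMC port, put both ends on a tiny private subnet, and mirror the same thing the other way.

    # on box B, configure its BMC for the private link (channel 1 here):
    ipmitool lan set 1 ipsrc static
    ipmitool lan set 1 ipaddr 192.168.254.2
    ipmitool lan set 1 netmask 255.255.255.252
    # on box A, bring up the spare NIC on the same /30 and check it works:
    sudo ip addr add 192.168.254.1/30 dev eth1
    ipmitool -I lanplus -H 192.168.254.2 -U admin -P 'secret' chassis power status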

michaelt

an hour ago

It is indeed very helpful when it's needed - but most IPMI is horrifically insecure, so usually it's not connected directly to the public internet. Instead it's on a separate network with special VPN access.

As this document focuses on keeping things simple, they don't have the isolated network / VPN needed to use IPMI.

cuu508

3 hours ago

What's the importance of having a switch/hub? Is it because you may want to add more servers later, but the colo host only provides one port in their router?

kelnos

2 hours ago

So you can plug your laptop into it at the colo when you set things up, and verify, while you're still there, that the server is working right and can get to the internet. And if it isn't working right, it'll be a lot easier to keep everything plugged into the switch than to swap between your laptop and the colo's network every time you try something to fix it.

But yes, if you think you might want to add other machines later, preemptively putting a switch in there will make it tons easier.
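
The check itself doesn't need to be fancy; from the laptop (and again from the server), something like this, with a placeholder gateway address, tells you the uplink, routing and DNS are all alive before you leave the cage:

    ip addr show              # did we end up with the address the colo assigned?
    ping -c 3 203.0.113.1     # the colo's gateway
    ping -c 3 1.1.1.1         # something past the gateway
    ping -c 3 example.com     # and DNS resolution on top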

pests

3 hours ago

Basically. You need to pay for a bigger pipe. The same applies to power: you have to pay for more capacity.

re-thc

3 hours ago

Also if you want to manage the provided IP ranges across more than one server.

behringer

an hour ago

Forget the switch and Pi and get a Ubiquiti router. Much more powerful and simple to set up, though it does require some know-how.

Also, you could see if your local hackerspace has a server rack.