The hosts file is present in every major operating system, including macOS, Windows, and Linux. On Unix-like systems such as Linux and macOS its path is /etc/hosts. On Windows its path is C:\Windows\System32\drivers\etc\hosts. Although largely a relic of the past, this oft-ignored configuration file can be surprisingly useful even in the modern age.
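If you want a script to find the file on whatever machine it runs on, something like the following minimal sketch would do. The function name hosts_path is made up for illustration, and the Windows branch assumes the system directory is given by the SystemRoot environment variable, falling back to C:\Windows.

```python
import os
import platform


def hosts_path(system=None):
    """Return the usual hosts file location for the given OS name.

    `system` uses the names returned by platform.system()
    ("Linux", "Darwin", "Windows"); it defaults to the current OS.
    """
    system = system or platform.system()
    if system == "Windows":
        # Assumption: the system directory is %SystemRoot%, normally C:\Windows.
        root = os.environ.get("SystemRoot", r"C:\Windows")
        return root + r"\System32\drivers\etc\hosts"
    # Linux, macOS (reported as "Darwin") and other Unix-like systems.
    return "/etc/hosts"


if __name__ == "__main__":
    print(hosts_path())
```

Editing the file needs administrator or root privileges on all three systems, so a script like this is only useful for reading unless it is run with elevated rights.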
First a little history on this file. In the days before DNS, all resolution of domain names to IP addresses was done by static lookup. The domain name resolver was an extremely crude system that simply looked up the domain name in a file called hosts.txt and matched it to the proper IP address. The hosts.txt file was simply a table listing address/domain pairs, and it had to be edited manually. If a domain was missing from the table, or worse, the address had changed, the name resolver was screwed. Needless to say, network admins had to spend an inordinate amount of time tediously updating this file by hand. Those were the Bad Old Days.
Now almost all domain name queries are done dynamically through DNS, and the hosts.txt file – now called simply “hosts” with the file extension omitted – is only used for a handful of very basic things like the loopback address. But it still works for regular domains, and you can use this to your advantage if you need to bypass DNS.
So why bypass DNS anyway? Well, there are a couple of reasons, and both of them are security-related. First of all, if you’re using a VPN, you don’t want your ISP to be able to see your DNS queries. They will see them if your lookups go to the ISP’s DNS servers instead of through the VPN tunnel, because the ISP controls those servers and can use the packets going in and out to track what websites you are visiting. This is a privacy vulnerability, and it’s known as a DNS leak. Even worse, the DNS server may be controlled by Google. In fact many DNS caching servers on home routers are preconfigured to use 8.8.8.8, which is Google’s DNS server.
“But wait. I’m not doing anything wrong. I got nothing to hide.” Okay, I hope you don’t mind potential future employers buying your browsing history and learning about your porn habits. And you’d better hope they’re not offended by your personal kinks. A couple years ago, a prominent Drupal developer was fired from the project after his boss learned that he had a thing for dominating women. You don’t have to be planning domestic terror attacks or selling drugs or looking at kiddie porn for your browsing history to come back to haunt you, especially in these politically correct times when a single slip of the tongue after a long day can cost someone their entire career. You never know how your data will be interpreted, you never know what could be used against you, and you never know what someone will choose to be offended by. So use a VPN.
The other reason you might want to bypass DNS is if you want to block certain websites entirely. These would typically be websites like googleadservices.com that perform covert surveillance and targeted advertising through other sites. You won’t be able to block these companies completely due to how ubiquitous they are, but it is at least a start. Remember, privacy is not a game of perfection; it’s a game of optimization.
Here is a screenshot of my hosts file that shows how to do both of these things:
The format of the hosts file is very simple. Each line consists of the IP address to map to, followed by whitespace, and then the domain that you want to map to that IP address. I use spaces between the two fields rather than tabs, because tabs did not work for me, although the Linux hosts(5) manual page says that any number of spaces or tabs is accepted. Anything after a # character is a comment.
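To make the format concrete, here is a minimal sketch of a parser for it. The names parse_hosts and SAMPLE are made up for illustration, and the sample entries use the reserved documentation address 192.0.2.10 and example.com rather than anything from my real file. It splits each line on runs of whitespace, which is what the Linux hosts(5) manual page describes; the sample itself uses spaces only.

```python
def parse_hosts(text):
    """Return a list of (ip, [names]) tuples from hosts-file text."""
    entries = []
    for line in text.splitlines():
        # Everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()  # split on any run of whitespace
        if len(fields) < 2:
            continue  # an address with no name is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


# Illustrative sample only; the addresses and names are placeholders.
SAMPLE = """\
# loopback
127.0.0.1   localhost
192.0.2.10  example.com www.example.com  # two names, one address
"""

if __name__ == "__main__":
    for ip, names in parse_hosts(SAMPLE):
        print(ip, names)
```

Note that one line may carry several names for the same address, which is why the parser returns a list of names per entry.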
The hosts file shown above has two sections. The first section lists the domains I want to block, basically Google Ad Services and Facebook. At the time of this screenshot I had Google blocked, but then I found that Captchas took far longer afterward, even after I unblocked Google’s domain (Google is evidently trying to punish people who try to escape their botnet), so I had to permanently remove it from the list. The way this block list works is fairly simple: each domain is remapped to my loopback address, so if any web application tries to phone home to one of these domains, the request is sent back to my own computer and never leaves it. I have effectively trapped these spyware apps in my own system, cutting off their contact with the corporate servers back home. Sure, I can’t use Facebook at all now, but in this day and age, no one should be using Facebook. That’s just my opinion.
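A block list like this is easy to generate. The following is a minimal sketch, not the way my own file was built: block_entries is a hypothetical helper, 127.0.0.1 is the usual IPv4 loopback address, and facebook.com is assumed here as the Facebook domain. It only produces the lines; pasting them into the hosts file is left to you.

```python
def block_entries(domains, sink="127.0.0.1"):
    """Build hosts-file lines that point each domain at `sink`."""
    lines = []
    seen = set()
    for domain in domains:
        name = domain.strip().lower()
        if not name or name in seen:
            continue  # skip blanks and duplicates
        seen.add(name)
        lines.append(f"{sink} {name}")
    return lines


if __name__ == "__main__":
    for line in block_entries(["googleadservices.com", "facebook.com"]):
        print(line)
```

The hosts file has no wildcards, so each subdomain you want blocked (www.facebook.com, for example) needs its own entry in the list.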
The next section is for domains that I want to access by IP. This means that when I type the address into my address bar, it will consult the hosts table instead of a DNS server. This works because the resolver consults its sources in a fixed order: first the hosts file, then the caching server, then a primary name server. (On Linux the order is set by the hosts line in /etc/nsswitch.conf.) If the domain is in the hosts file, the resolver stops looking there and immediately converts the domain to an address. Thus it never sends a DNS query for that domain, and no one can learn where I’m going by watching my DNS traffic. DNS leaks become a non-issue, at least for the domains I’ve listed.
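The lookup order can be sketched in a few lines. This is a toy model under stated assumptions, not how any real resolver is implemented: resolve, fake_dns, and the table contents are all made up for illustration, and the addresses come from the reserved documentation ranges.

```python
def resolve(name, hosts_table, dns_lookup):
    """Look `name` up in the hosts table first; fall back to DNS."""
    ip = hosts_table.get(name.lower())
    if ip is not None:
        return ip  # found locally, so no DNS query is made
    return dns_lookup(name)


# A stand-in for a DNS server that records every query it receives.
queries = []


def fake_dns(name):
    queries.append(name)
    return "198.51.100.7"


table = {"example.com": "192.0.2.10"}

if __name__ == "__main__":
    print(resolve("example.com", table, fake_dns))  # answered from the table
    print(resolve("example.org", table, fake_dns))  # falls through to fake_dns
    print(queries)
```

Only the name that was missing from the table ever reaches fake_dns, which is the whole point of pinning a domain in the hosts file.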
There is a downside to this, which is that if any of the addresses change, I will have to update the file manually. Also, sites that are protected by Cloudflare won’t let you access their domain by IP address, so this only works for sites that don’t use the service. This is sort of a temporary solution for me. Eventually I want to set up my own DNS server so I won’t have to use quick-and-dirty methods like this. In fact I’ve been working on learning how to run the BIND name server on a Raspberry Pi. That’s a potentially exciting project that I will hopefully get around to fairly soon. Until then, farewell and happy hacking.