Creating a VPC in AWS



In AWS terms a VPC is an isolated network environment. It is like your own lab network - complete with an internet connection, routers, switches and of course computers.

You do not set up a lab just for the sake of it. Rather, you set up a lab to contain a set of computers. These computers - or instances - do your major productive work. They serve websites, host databases, run algorithms on data and so on. Along with these major players, you have other types of computers doing things like monitoring, load balancing, traffic routing etc. The whole point of having a VPC - a virtual network - is to someday have lots of (virtual) computers. So be prepared for it.

Regions, VPCs and Availability zones

An AWS region is a group of data centers in one geographic area; for this post you can picture it as one large data center in a city. So we have regions like N. Virginia, Oregon etc. Almost everything in AWS is confined to a single region.

VPCs reside within a region. You cannot have your network span across Ohio and Mumbai. You can have one vpc in Ohio and another in Mumbai but not the same one.

Availability zones (AZs) are a bit trickier. Imagine that an AWS data center has multiple buildings. Each building is separate from the others, with its own internet connection, its own electricity supply and its own generators. Now imagine it is arranged so that even if a whole building fails, the other buildings in the same data center keep working. These buildings are the AZs. Most regions have three AZs, and a few have more (up to six).

A VPC cannot span across regions, but it can (and should) span across multiple AZs within its region.

Creating a VPC

We will not use the Start VPC wizard. Rather we go the hard way and Create VPC. We will need a name, an IPv4 CIDR block and a tenancy. Name is easy [Citation needed]. Keep tenancy at Default unless you really need dedicated tenancy.

IPv4-CIDR Block

TL;DR: put here any one of:
  • 172.16.0.0/16
  • 172.17.0.0/16
  • ...
  • 172.30.0.0/16
  • 172.31.0.0/16
This defines the range of IP addresses that your instances will get. This is not the only addressing scheme; there are many others (e.g. 192.168.0.0/24 or 10.0.0.0/20). We will need a separate post to cover the what, how and why of these schemes. For now stick with one from the above list.
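As a rough illustration of what these blocks are, here is a minimal sketch using Python's standard ipaddress module. The variable names are only for illustration, and picking 172.17.0.0/16 is an arbitrary assumption.

```python
import ipaddress

# The sixteen /16 blocks from the list above: 172.16.0.0/16 ... 172.31.0.0/16.
candidates = [ipaddress.ip_network(f"172.{n}.0.0/16") for n in range(16, 32)]

# They all sit inside 172.16.0.0/12, one of the private ranges reserved by RFC 1918.
private_range = ipaddress.ip_network("172.16.0.0/12")
assert all(block.subnet_of(private_range) for block in candidates)

# Pick one block for the VPC (an arbitrary choice for this sketch).
vpc_block = candidates[1]
print(vpc_block, "holds", vpc_block.num_addresses, "addresses")
```

A /16 leaves 16 bits for hosts, which is why each of these blocks covers 65,536 addresses.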

Subnets

You can imagine a subnet as one server rack. Within one building, you can set up as many racks as you want. One rack belongs to one and only one building. You cannot carry it later to a different building (AZ). The individual computers (instances) go within these racks (subnets). Imagine an AWS subnet as a very large rack capable of holding thousands of instances.

To create a subnet, you need to specify which VPC and which AZ. Remember - you are setting up a rack (subnet) - in a building (AZ) - in a datacenter (region). 


IPv4 CIDR Block (again!)

For now put here any block of the form x.y.z.0/20, where x.y comes from the first step. So if you started with 172.17.0.0/16 in the first step, then your x.y will be 172.17. For z, put any one of the following values: 0, 16, 32, ... 224, 240. You will notice that in this way we can form 16 unique values - giving us 16 possible subnets in the VPC.
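The arithmetic behind those 16 values can be checked with a short sketch, again using Python's ipaddress module. It assumes the VPC block is 172.17.0.0/16, as in the example above.

```python
import ipaddress

vpc_block = ipaddress.ip_network("172.17.0.0/16")   # assumed choice from the first step

# Split the /16 into /20 blocks; each one is a candidate subnet.
subnets = list(vpc_block.subnets(new_prefix=20))

# The third octet (the "z" in x.y.z.0/20) of every candidate.
third_octets = [int(str(s.network_address).split(".")[2]) for s in subnets]

print(len(subnets))
print(third_octets)
print(subnets[1], "holds", subnets[1].num_addresses, "addresses")
```

Going from /16 to /20 borrows 4 bits, so there are 2^4 = 16 subnets of 4,096 addresses each. AWS reserves five addresses in every subnet, so slightly fewer are usable.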

Again, this is not the only scheme. You can have more or fewer subnets, but that demands a firm understanding of subnetting and some binary arithmetic.

Let us create four subnets as shown in the picture:
The idea is:
  • Have two types of subnets - PUB (for public) and PVT (for private). I will get to the difference soon.
  • Each of pub and pvt should be in at least two different AZs.
  • To reiterate, the two PUBs should not be in the same AZ, and the two PVTs should not be in the same AZ.
Here I have used AZ-a and AZ-b for both pub and pvt. You can choose to use four different AZs for the four subnets. The only thing to consider is that communication between instances within the same AZ is slightly better than communication across AZs. Here better means faster and more stable. So communication between instances in pub-a and pvt-a will be slightly better than between instances in pub-a and pvt-b. The same goes for the b/b and b/a combinations.
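The layout rule can be written down as a small check. This is only a sketch: the plan dictionary, the CIDR blocks assigned to each subnet and the function names are assumptions made for illustration, not something AWS provides.

```python
from collections import defaultdict

# Hypothetical plan: subnet name -> (kind, availability zone, CIDR block).
plan = {
    "pub-a": ("public",  "AZ-a", "172.17.0.0/20"),
    "pub-b": ("public",  "AZ-b", "172.17.16.0/20"),
    "pvt-a": ("private", "AZ-a", "172.17.32.0/20"),
    "pvt-b": ("private", "AZ-b", "172.17.48.0/20"),
}

def zones_by_kind(plan):
    """Group the availability zones used by each kind of subnet."""
    zones = defaultdict(set)
    for kind, az, _cidr in plan.values():
        zones[kind].add(az)
    return dict(zones)

def spans_two_zones(plan):
    """True when every kind of subnet is spread over at least two AZs."""
    return all(len(azs) >= 2 for azs in zones_by_kind(plan).values())

print(spans_two_zones(plan))
```

If both public subnets were placed in AZ-a, the check would fail, which is exactly the situation the bullets above warn against.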

Differentiating between PUB and PVT

The public subnets are supposed to hold instances which are directly accessible from the internet. The thing is, you do not want just any instance to be accessible from the internet. The internet is dark and full of terrors - literally. Anything that is accessible is hackable. So you place only very few, very strong, very secure instances in the public segment of your network. If you have ever put computers on the open internet, you will know that these machines face a constant barrage of attacks.

AWS by default does NOT provide public IPs to instances. Without a public IP an instance (generally) cannot be reached from the internet. Some computers, like web servers, require a public IP (more on that later). We put such instances in one of the PUBlic subnets. Select each of pub-a and pub-b one by one, and go to Subnet Actions > Modify auto-assign public IP > Enable.


Instances launched in these subnets will have a public IP. Having a public IP is necessary but not sufficient for internet access. We need three more things - a firewall (security group), a gateway and a route. At this point your subnets should look like this:

Security Group 

A security group is a set of firewall rules. You can apply one or more such groups to an instance. Each rule in a group is an instruction - to allow some port - for some protocol - from some source IP. The best practices for configuring security groups, or any firewall in general, are beyond the scope of this text. Here I will go with a simple scheme.

WARNING! The following text is NOT vetted for production grade security.

There is already a Default security group in every vpc. Ignore that for a moment. Create two security groups, one for public subnet and one for private subnet. Please note that security groups do not have any correlation with subnets in general. They are just a bunch of rules. So here we are:

pub-sec-grp

In this article we are only covering the inbound rules, not the outbound rules. This means that our instances are allowed to talk to anyone on the internet, but strangers on the internet are not allowed to talk to us. A better approach is to restrict both inbound and outbound traffic, but that is for another day.

As shown in the screenshot, we have 4 inbound rules. We allow HTTP(S) requests so that we can run some web server and access it over the internet. Note the source of 0.0.0.0/0. In current context it means any IP.

Note the SSH and RDP entries. You need to access your instances yourself: to install software, host websites, take backups etc. Generally you access Linux instances through SSH and Windows instances through RDP, so we allow these two protocols as well. What is special about them is that these ports cannot be reached from just anywhere, only from one definite IP address. I have put a random IP, but you should put your office IP. Hopefully you have a static IP at your office.

Here is how this works: You launch an instance and associate it with this security group. If your instance has a public IP, you can try to connect to it from the internet. Every incoming connection to your instance is evaluated against these rules. For example, if you try connecting to MySQL on port 3306, the connection is denied outright. If you connect over RDP, you have to do it from your office, otherwise it is denied. Even from the office, once Amazon's firewalls let you through, you can have one additional firewall at the instance level. All Windows machines have a firewall enabled by default. On Linux instances you have to set up iptables yourself. The Windows firewall by default allows you to RDP into the machine, but not to ping it. You have to log onto the Windows instance and configure the firewall yourself.
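To make the evaluation concrete, here is a minimal simulation of the four inbound rules in plain Python. It is a toy model for reasoning, not how AWS implements security groups. The office address 203.0.113.10 and the stranger's address 198.51.100.7 are made-up values from documentation ranges, and is_allowed is a hypothetical helper.

```python
import ipaddress

OFFICE_IP = "203.0.113.10/32"   # hypothetical office address (documentation range)

# (protocol, port, allowed source) - mirrors the four inbound rules described above.
PUB_SEC_GRP = [
    ("tcp", 80,   "0.0.0.0/0"),
    ("tcp", 443,  "0.0.0.0/0"),
    ("tcp", 22,   OFFICE_IP),
    ("tcp", 3389, OFFICE_IP),
]

def is_allowed(rules, protocol, port, source_ip):
    """Allow the connection if any rule matches; anything unmatched is denied."""
    source = ipaddress.ip_address(source_ip)
    return any(
        protocol == rule_protocol
        and port == rule_port
        and source in ipaddress.ip_network(rule_source)
        for rule_protocol, rule_port, rule_source in rules
    )

print(is_allowed(PUB_SEC_GRP, "tcp", 443, "198.51.100.7"))    # HTTPS from a stranger
print(is_allowed(PUB_SEC_GRP, "tcp", 3389, "198.51.100.7"))   # RDP from a stranger
print(is_allowed(PUB_SEC_GRP, "tcp", 3389, "203.0.113.10"))   # RDP from the office
print(is_allowed(PUB_SEC_GRP, "tcp", 3306, "203.0.113.10"))   # MySQL, no rule exists
```

Note that there are only allow rules. A connection that matches no rule is simply dropped, which is why MySQL on 3306 is refused even from the office.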

pvt-sec-grp

On your network there will be machines which are not supposed to be accessed from the open internet. Take the database server for example. Whether you use SQL Server or MySQL or anything else, the only one that should access your database is the web server itself. Instances like these are
  1. placed in private subnet - i.e. they do not have a public ip address.
  2. associated to private security group - i.e. firewalls prevent any incoming traffic to them.
Either of the two approaches above is sufficient to provide decent isolation, but for safety reasons we apply both.

Note the two firewall rules. The first one allows all traffic from pub-sec-grp. This means that instances like web servers can freely communicate with backend services. This behavior is sufficient for the current text, but is not production grade. The web server really does not need to send every type of traffic to the MySQL instance. It only needs to send MySQL packets. So the All aspect can be fine-tuned. But that is for some other post.

Notice the second entry - All traffic from pvt-sec-grp. What is this? It says that instances belonging to the private security group can exchange network traffic among themselves freely. This was not allowed in pub-sec-grp. There we were more cautious.
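The difference from pub-sec-grp is that the source of each rule is another security group rather than an IP range. A toy model of that idea, with hypothetical names throughout, could look like this:

```python
# Inbound rules of pvt-sec-grp in this sketch: all traffic is allowed, and the
# source is a security group (identified by name here) instead of an IP range.
PVT_SEC_GRP_SOURCES = {"pub-sec-grp", "pvt-sec-grp"}

def pvt_allows(sender_groups):
    """Allow traffic when the sender belongs to any of the listed source groups."""
    return bool(PVT_SEC_GRP_SOURCES & set(sender_groups))

web_server = ["pub-sec-grp"]    # instance in the public segment
database = ["pvt-sec-grp"]      # another instance in the private segment
stranger = []                   # an internet host belongs to none of our groups

print(pvt_allows(web_server), pvt_allows(database), pvt_allows(stranger))
```

Because membership in a group is what matters, the rule keeps working when instances are replaced or their private IP addresses change.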

How do we access a private instance? We RDP or SSH into a public instance, and from there we make a second-level RDP or SSH connection to the internal private instance. In the current text we have only the web server in the public segment, so to access the database server we SSH into the web server and then SSH again onto the database server. In a more critical environment, we might want a public instance dedicated to SSH'ing or RDP'ing into. You can also get in through the VPN backdoor. More on security later.

Gateway

So far we have a network with partitions and segments. What we do not have is any real hardware. There are no computers in our network. We start adding hardware with the most critical thing - the router. In real life, routers are expensive pieces of equipment. Having and maintaining one is enough of a headache, never mind having several for redundancy. In AWS though, we have gateways - managed routers.
  1. VPC Dashboard > Filter by vpc > None
  2. Internet Gateways > Create internet gateway > name > yes, create
  3. Select the new gateway > Right click > Attach to vpc > select vpc > yes, attach
That's it. It is that simple to provision a router for your network. What is great is that (a) the router itself is free - no rent for the router, only for the traffic flowing through it - and (b) the router is scalable - you do not have to care about the load on it.

One VPC can have at most one internet gateway. It can even have no gateway at all! If your instances do not need internet you can go this way, but without internet how will you access them? Right: VPN.


Route Table

We have a router and four subnets - two public, two private. The last leg of setting up the network is configuring the route tables. This is not as hard as it sounds. We start with a default route table called the main route table. Like all default things, we ignore this one and start creating our own. We will create two route tables: one for the public segment and one for the private segment.

pub-route-table

Create one route table in your VPC and name it pub-route-table. Configure it as follows:

  1. Routes > Edit > Add > Destination: 0.0.0.0/0, Target: internet-gateway > Save
  2. Subnet Associations > Edit > Tick pub-a and pub-b > Save
What we did is instruct the router that computers in the pub-a and pub-b subnets will reach the internet via the gateway. Destination 0.0.0.0/0 practically means "the big bad internet". Remember a similar entry in the security group firewall? There the source of traffic was "the big bad internet", i.e. 0.0.0.0/0.
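How a route table picks a target can be sketched as a longest-prefix match. The code below is a simplified model in plain Python, assuming a VPC block of 172.17.0.0/16. next_hop is a hypothetical helper, and the second table stands in for a route table that has no internet route. Every VPC route table also carries a "local" route for the VPC's own block, which is included here.

```python
import ipaddress

VPC_CIDR = "172.17.0.0/16"   # assumed VPC block

# destination -> target
PUB_ROUTE_TABLE = {VPC_CIDR: "local", "0.0.0.0/0": "internet-gateway"}
NO_INTERNET_TABLE = {VPC_CIDR: "local"}   # a table without the 0.0.0.0/0 route

def next_hop(route_table, destination_ip):
    """Return the target of the most specific matching route, or None if none matches."""
    address = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix (most specific route) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(next_hop(PUB_ROUTE_TABLE, "172.17.32.5"))   # another instance inside the VPC
print(next_hop(PUB_ROUTE_TABLE, "8.8.8.8"))       # some host on the internet
print(next_hop(NO_INTERNET_TABLE, "8.8.8.8"))     # no matching route
```

Traffic inside the VPC matches both routes but takes the more specific local one, so only traffic for everything else goes to the gateway.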

pvt-route-table

Create another route table in your VPC and name it pvt-route-table. Configure it as follows:
  1. Subnet Associations > Edit > Tick pvt-a and pvt-b > Save
That's it. We do not want the secure computers on the private segment of our network to be exposed to the bad influence of the internet. Yes! A good way to secure computers is not to put them on the internet. There is a middle ground, internet access with some compromise, which is to use NAT. We will keep that for another post.


