Network Engineering #1

One of the most fundamental concepts you must understand as a developer or a DevOps professional is how computers communicate with each other through the network. This knowledge forms the foundation that supports many applications and services we use on a daily basis, such as accessing web pages and applications, downloading digital videos and audios, and connecting to printers.

OSI Model

The theoretical framework underpinning all the networks we use today is the Open Systems Interconnection (OSI) model, which consists of seven layers: application layer, presentation layer, session layer, transport layer, network layer, data link layer, and physical layer. To understand what they do, let's use an example of a private home network where you want to access a web application running on your laptop via your phone.

When the phone sends the request, it goes through all seven layers from top to bottom, starting from the 7th layer and ending at the 1st. Each layer wraps the request with the information the receivers need to analyze it quickly. (Note: Be aware that the descriptions I provide for each layer abstract away many details.)

Layer 7. Application Layer

If you're familiar with web development, you’ve likely performed an HTTP request to a URI, which is the job of the application layer. The phone will attach the IP (Internet Protocol) address, port, and other information needed to send the request. The IP address is the logical (and changeable) address of the network node, while the port identifies the connection point for the specific service running on the computer.
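As a minimal sketch of what "IP address and port" look like in practice, Python's standard urllib.parse module can split a URI into those parts. The URL below (address 192.168.1.10, port 8080) is a made-up example, not something from a real network.

```python
from urllib.parse import urlsplit

# Hypothetical URL of the web application running on the laptop.
url = "http://192.168.1.10:8080/index.html"

parts = urlsplit(url)
host = parts.hostname  # IP address (or name) of the target node
port = parts.port      # port of the service on that node
path = parts.path      # resource requested from that service

print(host, port, path)
```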

Layer 6 & 5. Presentation & Session Layers

The presentation layer optionally performs encryption on your request information so that it is protected against interception. Whether encryption is performed depends on the type of protocol used in the application layer. After the optional encryption, the session layer attaches session information to the request for organizing multiple requests and responses.

Layer 4. Transport Layer

This layer is responsible for breaking the request down into multiple manageable data segments and attaching a header to each segment. The most important header fields are the source port (the port of the phone sending the request) and the destination port (the port of the target service). Together they ensure the request reaches the right service on the target and the response is sent back to the correct source port.
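The segmentation step can be sketched roughly as follows. This is a simplified illustration, not the real TCP format: the 6-byte header (source port, destination port, sequence number), the function name make_segments, the 8-byte segment size, and the port numbers are all assumptions chosen for readability.

```python
import struct

def make_segments(payload: bytes, src_port: int, dst_port: int, size: int = 8) -> list[bytes]:
    """Split payload into chunks and prepend a simplified 6-byte header to each."""
    segments = []
    for seq, start in enumerate(range(0, len(payload), size)):
        chunk = payload[start:start + size]
        # "!HHH": network byte order, three unsigned 16-bit fields
        # (source port, destination port, sequence number).
        header = struct.pack("!HHH", src_port, dst_port, seq)
        segments.append(header + chunk)
    return segments

segments = make_segments(b"GET /index.html", src_port=50000, dst_port=8080)

# The receiver reads the header back to learn which service the data is for.
first_src, first_dst, first_seq = struct.unpack("!HHH", segments[0][:6])
print(len(segments), first_src, first_dst, first_seq)
```

Both ports travel in the header at the front of every segment, so the receiver can route a segment to the right service before reading the rest of it.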

Layer 3. Network Layer

This layer further breaks the segments down into multiple data packets if necessary and attaches a header to each packet containing the phone’s IP address (the source) and the target computer’s IP address (the destination). This ensures that the receiver can verify that the request is indeed addressed to them.

Layer 2. Data Link Layer

The data link layer receives the data packets and wraps each one in a frame whose header carries the MAC addresses of the target computer and the phone. The MAC address is a hardware address burned into the networking chips embedded in computers. Unlike the IP address, it does not change when the device joins a different network, although most operating systems allow it to be overridden in software.
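For a concrete picture, here is a rough sketch of an Ethernet II style header: destination MAC (6 bytes), source MAC (6 bytes), and a 2-byte type field (0x0800 means the payload is IPv4). The MAC addresses and helper names are invented for illustration, and the sketch leaves out the checksum trailer and minimum-size padding that real frames have.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # type value meaning "the payload is an IPv4 packet"

def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def build_frame(dst_mac: str, src_mac: str, packet: bytes) -> bytes:
    """Prepend a 14-byte header: destination MAC, source MAC, type."""
    header = struct.pack("!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), ETHERTYPE_IPV4)
    return header + packet

# Hypothetical addresses: laptop (destination) and phone (source).
frame = build_frame("aa:bb:cc:dd:ee:01", "aa:bb:cc:dd:ee:02", b"packet bytes")
print(frame[:6].hex(":"))  # destination MAC comes first in the frame
```

Putting the destination MAC first means a receiver can decide whether to keep the frame after reading only its first six bytes.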

Layer 1. Physical Layer

Finally, the data is converted into bit signals, 0s and 1s, expressed using electrical signals or radio waves at a certain bandwidth, and sent via cables or air (Ethernet or WiFi). Ethernet is generally considered more secure than WiFi because broadcast radio waves are easier to intercept than electrical signals sent via a physical cable connecting specific devices.
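The conversion to 0s and 1s can be illustrated with a tiny sketch. The helper names to_bits and from_bits are made up for this example, and real hardware adds encoding and timing details that are ignored here.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as eight 0/1 characters."""
    return "".join(f"{byte:08b}" for byte in data)

def from_bits(bits: str) -> bytes:
    """Rebuild the original bytes from groups of eight bits."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

bits = to_bits(b"Hi")
print(bits)  # 0100100001101001
```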

Why All This?

You might wonder why all these steps are necessary for simple communication between a phone and a computer. The reason for breaking the task up and wrapping the data in this structured way is to make it easier for the receiver to analyze the data by peeling off one layer at a time. For example, when you send request signals via WiFi, all devices in close proximity receive the signals and start analyzing them.

First, the binary data is reassembled into a data link frame, and the destination MAC address in its header is checked. If it’s not your MAC address, you can immediately discard it regardless of what is underneath. Only when the MAC address matches do you analyze further and check if the IP address matches. This process continues until you're sure the request is meant for you and your specific service. This divide-and-conquer approach is a common pattern in computer science.
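The layer-by-layer check described above can be sketched like this. Nested dictionaries stand in for real headers, and every address, port, and name (MY_MAC, MY_IP, OPEN_PORTS, accept) is a hypothetical value chosen for the example.

```python
MY_MAC = "aa:bb:cc:dd:ee:01"
MY_IP = "10.0.0.5"
OPEN_PORTS = {8080}  # ports that have a service listening

def accept(frame: dict) -> bool:
    """Peel one layer at a time and stop at the first mismatch."""
    if frame["dst_mac"] != MY_MAC:
        return False  # not for this device: inner layers are never inspected
    packet = frame["payload"]
    if packet["dst_ip"] != MY_IP:
        return False
    segment = packet["payload"]
    return segment["dst_port"] in OPEN_PORTS

for_me = {
    "dst_mac": MY_MAC,
    "payload": {"dst_ip": MY_IP, "payload": {"dst_port": 8080, "data": b"GET /"}},
}
for_neighbor = {"dst_mac": "aa:bb:cc:dd:ee:99", "payload": None}

print(accept(for_me), accept(for_neighbor))
```

The frame for the neighbor is rejected after a single comparison, which is the saving the layered structure is meant to provide.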

(Note: Whether to discard irrelevant requests is your choice. Network protocol analyzers like Wireshark work by not discarding intercepted data, allowing analysis of all captured traffic regardless of the intended recipient.)

IP Address

Public vs Private IP Address

IP addresses come in two types: public and private. A public IP address is a unique address exposed to the world outside your home network, allowing communications with external servers like Google and YouTube. How we identify the IP address of these services and send requests will be covered in future articles.

A private IP address, on the other hand, is the address of computers within the same private network, typically assigned by a router from one of the ranges reserved for private use (e.g., 10.0.0.0 to 10.255.255.255). Devices within the same network can learn each other's private IP addresses, allowing requests to be sent and received smoothly. This separation reduces security risks and allows multiple devices to share one unique public address for internet access.
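Python's standard ipaddress module can tell the two kinds of address apart, which makes for a quick sanity check. The sample addresses below are arbitrary; 10.0.0.0/8 is simply another way of writing the 10.0.0.0 to 10.255.255.255 range mentioned above.

```python
import ipaddress

# Arbitrary sample addresses to classify.
samples = ["10.0.0.7", "192.168.1.10", "8.8.8.8"]
labels = {}
for text in samples:
    addr = ipaddress.ip_address(text)
    labels[text] = "private" if addr.is_private else "public"
print(labels)

# The example private range written in prefix notation.
home_range = ipaddress.ip_network("10.0.0.0/8")
print(home_range.network_address, home_range.broadcast_address)
```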

IPv4 & IPv6

While the same private address can exist across different networks, public addresses must be globally unique to avoid confusion between routers. IPv4, standardized in the early 1980s, uses 32-bit addresses (e.g., 10.255.255.255), allowing for roughly 4.3 billion unique addresses. However, the rapid expansion of the internet has outpaced the number of available IPv4 addresses.

IPv6 was introduced to solve this problem, using 128-bit addresses (e.g., 2293:dfd8:0da5:c165:94c2:3898:8de5:5a03), providing about 340 undecillion unique addresses. In addition to a vast number of combinations, IPv6 offers numerous benefits, some of which we will cover in future articles.
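The address counts follow directly from the number of bits, as this short sketch shows. The variable names are made up for the example.

```python
import ipaddress

ipv4_total = 2 ** 32    # every possible 32-bit address
ipv6_total = 2 ** 128   # every possible 128-bit address

print(f"{ipv4_total:,}")   # 4,294,967,296
print(f"{ipv6_total:.2e}")

# The ipaddress module recognizes both formats used in this article.
v4 = ipaddress.ip_address("10.255.255.255")
v6 = ipaddress.ip_address("2293:dfd8:0da5:c165:94c2:3898:8de5:5a03")
print(v4.version, v6.version)
```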

MAC Address

Address Resolution Protocol

At the data link layer, we attach the MAC address, but it’s not readily available like IP addresses are. How do we obtain the MAC address of the target receiver? This is done using the Address Resolution Protocol (ARP), where the IP address is used to retrieve the MAC address. When the sender moves from the network layer to the data link layer, it broadcasts a request asking who has the target IP address. If the target is in the same network, it responds with its MAC address. If not, the sender, which can tell from its own IP address and subnet mask that the target is outside the network, asks for the MAC address of the router (its default gateway) instead.

The sender can then attach the MAC address and send the request via the physical layer. To speed up this process, IP and MAC address pairs are cached for some time.

Why Do We Need MAC Addresses?

You might wonder why both a MAC address and a private IP address are needed. Why request the MAC address with ARP when the private IP address already identifies the device? The answer becomes clear when you consider sending requests to a service outside your private network, like Google.

You determine the public IP address of the destination router (the details of this process will be covered in future articles), attach it at the network layer, and use ARP to obtain a MAC address. Since Google is outside your private network, the MAC address you end up with is your router’s, not Google’s. You then attach this MAC address to your request at the data link layer and send the binary data at the physical layer.
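A sender's choice of whose MAC address to look up can be sketched with the ipaddress module. The home subnet 10.0.0.0/24, the router address 10.0.0.1, the function name arp_target, and the outside address 203.0.113.9 (taken from a range reserved for documentation) are all assumptions for this example.

```python
import ipaddress

LOCAL_NETWORK = ipaddress.ip_network("10.0.0.0/24")  # assumed home subnet
DEFAULT_GATEWAY = ipaddress.ip_address("10.0.0.1")   # assumed router address

def arp_target(destination: str):
    """Return the IP address whose MAC address must be resolved with ARP."""
    dest = ipaddress.ip_address(destination)
    # Same network: resolve the destination itself.
    # Different network: resolve the router, which forwards the request.
    return dest if dest in LOCAL_NETWORK else DEFAULT_GATEWAY

print(arp_target("10.0.0.42"))    # 10.0.0.42
print(arp_target("203.0.113.9"))  # 10.0.0.1
```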

When your router receives the request, it replaces your source MAC address and private IP address with its own MAC address and public IP address. Then, it uses the destination IP address and a routing algorithm to figure out how to reach the destination router. According to the routing algorithm, it finds the next immediate router to pass the request to and uses ARP to obtain the MAC address of that immediate router. It then rewrites the destination MAC address to the MAC address of the immediate router, without changing the destination IP address.

Regardless of how many routers are involved to get to the destination router (identified by the routing algorithms using IP addresses), this process of replacing MAC addresses without changing the destination IP address repeats until the request arrives at the final destination. In a nutshell, the IP address is used for identifying the final destination, while the MAC address is used for the immediate destination.
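The hop-by-hop rewriting can be simulated in a few lines. This sketch starts after the home router has already swapped in its public IP address, and the Frame class, the forward function, the MAC labels, and the IP addresses (from documentation ranges) are invented for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str

def forward(frame: Frame, own_mac: str, next_hop_mac: str) -> Frame:
    """A router rewrites only the MAC addresses; the IP addresses stay the same."""
    return replace(frame, src_mac=own_mac, dst_mac=next_hop_mac)

# Frame leaving the home router on its way to the first router of the provider.
frame = Frame("mac-home-router", "mac-isp-router", "203.0.113.7", "198.51.100.20")

# Each entry: (MAC of the router handling the frame, MAC of the next hop).
path = [("mac-isp-router", "mac-core-router"), ("mac-core-router", "mac-server")]
for own_mac, next_hop_mac in path:
    frame = forward(frame, own_mac, next_hop_mac)

print(frame)
```

After the loop, the MAC fields describe only the last hop, while dst_ip still names the final destination.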

What does this have to do with direct communication between computers on the same private network? When you are connecting to other computers within the same private network, you technically don’t need to go through this process because the final destination is also the immediate destination. However, the process is still followed to maintain consistency (at least, that’s the best explanation I’ve found).

Conclusion

In this article, we covered the OSI model and introduced the basics of IP and MAC addresses. These concepts may seem overwhelming due to the number of acronyms and new ideas, so I recommend taking your time to review and explore alternative resources for deeper understanding. In the next articles, we will continue discussing the fundamentals of networking, so stay tuned.

Resources

Cyber Security Mysteries. 2022. Why Do We Need Both a Private IP Address and a MAC Address?. YouTube.
Nasser, H. 2019. The OSI Model - Explained by Example. YouTube.