DEPT® Engineering Blog / Kotlin

What I learned about network programming with Kotlin/Native

A note to preface: you can find the project repository here.

The origin story

After a while developing software in Java, I was intrigued by all the praise of Kotlin and also interested in Kotlin/Native as a way to develop applications that don't require a VM to run. Inspired by this Golang implementation of traceroute, I settled on trying my hand at implementing traceroute with Kotlin/Native.

traceroute is already available on practically every operating system, so why the hell would I rewrite it? Good question. I had the opportunity to spend some time learning new things, and I'm always up for a good challenge. In this case, the challenge was to fearlessly venture into the abyss of TCP/IP, network programming, and UNIX.

OK, I'll bite... what's Kotlin/Native?

Kotlin/Native is a technology that compiles Kotlin code into a native binary for a number of different platforms - no JVM required. As indicated in the Kotlin documentation, this technology is meant to bridge the gap when deploying to certain platforms:

Kotlin/Native is primarily designed to allow compilation for platforms on which virtual machines are not desirable or possible, such as embedded devices or iOS.

Kotlin/Native includes a wonderful capability, called cinterop, to easily make calls to C libraries.

My hope is that my project contains useful examples of what Kotlin/Native (and cinterop) is capable of accomplishing. The Kotlin docs are great but somewhat unclear for parts like memory management, casting, and other minutiae. There should be some good snippets from this project of what those trickier parts of Kotlin/Native look like in practice.

A nifty tool

traceroute (aka tracert on Windows) is a common system utility that collects network latency information by probing each hop in between you and the place you're trying to connect to. In other words, we can find out the time it takes for a packet to get to each stop in the network on the way to the destination.

There are different ways to probe the network, but the most direct one is to send out ICMP "Echo Request" packets, to which our destination will respond with an "Echo Reply" packet.

At the heart of traceroute is what happens between the origin and destination. Probing every hop along the way is made possible by the IP TTL, or time-to-live. This value is set in the IP header and defines when a packet expires on the network. Routers use this value to determine whether they should pass the packet forward to the next device. If a router does pass the packet forward, it decrements the IP TTL by 1.

So, as you can imagine, it'll eventually reach 0. When the IP TTL hits 0, the router discards the packet and sends a response back to the origin, called a "Time Exceeded Error" message (ICMP type 11).

Here I used my project to query google.com, and captured this Time-to-live exceeded message with Wireshark (tcpdump icmp is also a handy command for capturing all ICMP traffic).

And that's basically it. traceroute first sends a packet with a TTL of 1, and then keeps sending more, incrementing the TTL by 1 each time. The program waits for each "Time Exceeded Error" message returned by every router along the way until there's finally a packet that doesn't exceed the TTL but instead reaches the destination. At that point, the destination will send back an "Echo Reply" packet.
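To make that loop concrete, here's a minimal sketch in plain Kotlin. It doesn't touch the network at all: the path is simulated, the router addresses are made-up documentation addresses, and names like sendProbe and trace are hypothetical rather than taken from the project.

```kotlin
// Simulated path: the routers between us and the destination (made-up addresses).
val routers = listOf("10.0.0.1", "172.16.0.1", "203.0.113.1")
val destination = "198.51.100.7"

sealed class Reply {
    data class TimeExceeded(val from: String) : Reply()
    data class EchoReply(val from: String) : Reply()
}

// Walk one probe along the path. Each router decrements the TTL and,
// if it reaches 0, drops the packet and answers with Time Exceeded.
fun sendProbe(ttl: Int): Reply {
    var remaining = ttl
    for (router in routers) {
        remaining -= 1
        if (remaining == 0) return Reply.TimeExceeded(router)
    }
    // The probe survived every router, so the destination answers.
    return Reply.EchoReply(destination)
}

// Send probes with TTL 1, 2, 3, ... until the destination replies
// or we give up after maxHops.
fun trace(maxHops: Int = 30): List<String> {
    val hops = mutableListOf<String>()
    for (ttl in 1..maxHops) {
        when (val reply = sendProbe(ttl)) {
            is Reply.TimeExceeded -> hops += reply.from
            is Reply.EchoReply -> {
                hops += reply.from
                return hops
            }
        }
    }
    return hops
}

fun main() {
    trace().forEachIndexed { i, hop -> println("${i + 1}  $hop") }
}
```

A real implementation would also time each probe and cope with hops that never answer, which this sketch leaves out.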

But first, the internet

Beneath it all, the internet is made up of many devices speaking to each other using a number of protocols. These protocols are usually described in terms of what is called the OSI model: a stack of layers, nested like Russian dolls. At the bottom, there are physical circuits that only have to worry about reliably transferring individual bits over a cable or an antenna. And at the top, there's someone writing lines of code to make an HTTP API request.

IP

Whether you realize it or not, when your computer is talking back and forth with other machines, it's exchanging data in the form of packets. Keeping in mind that there are multiple layers that allow devices to communicate over the internet, packets exist on the Network and Transport layer. And keeping in mind that everything generally follows a standard, packets follow a protocol called TCP/IP (transmission control protocol/internet protocol), or simply, the internet protocol suite.

TCP/IP is the central standard of the internet.

TCP & UDP

Each layer of the network stack has a different protocol, and sometimes there are even multiple protocols to choose from on a single layer. For instance, if you're working on the Transport layer, you'll often be choosing between sending a UDP or TCP packet.

You may know that UDP is for firing off quick messages. Those messages may disappear into the abyss: they might be received, or they might not.

TCP, on the other hand, involves an exchange of messages in order to form a connection. The operating system keeps track of these connections. With TCP, devices on both sides get confirmation that the data got to the other end without any errors, and that the data will be received in the exact order it was sent.

Pick your protocol

It turns out that traceroute can be written to utilize packets at either level of the network stack: the Network or Transport layer. Remember that even though the packets may be different, this doesn't change the fact they all work by setting the time to live.

One transport layer method is to fire off UDP packets toward the destination, with the TTL set so that one expires at each hop.

Another transport layer method, using what's called a half-open scan or SYN scan, is to attempt to open a TCP connection to the destination, again with the TTL set so that each attempt expires one hop further along.

And then there's another type of packet, called an ICMP (internet control message protocol) packet, that lives a layer below UDP and TCP, in the network layer. The confusing bit is that while ICMP is pretty much the same deal as UDP and TCP, there's one crucial difference.

ICMP, UDP, and TCP packets are all payloads nested in an IP packet. But the difference is that ICMP is a part of the IP standard, whereas UDP & TCP are just arbitrary data as far as IP is concerned.
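Here's a small sketch of what that nesting looks like in bytes. In an IPv4 header, byte 9 holds the protocol number of the nested payload (1 for ICMP, 6 for TCP, 17 for UDP), and the low four bits of byte 0 give the header length in 32-bit words. The function names are hypothetical, and the header below is hand-built for illustration rather than captured from the wire.

```kotlin
// Name the payload protocol from byte 9 of an IPv4 header (IANA protocol numbers).
fun protocolName(ipHeader: ByteArray): String {
    require(ipHeader.size >= 20) { "an IPv4 header is at least 20 bytes" }
    return when (val number = ipHeader[9].toInt() and 0xFF) {
        1 -> "ICMP"
        6 -> "TCP"
        17 -> "UDP"
        else -> "other ($number)"
    }
}

// Offset at which the nested payload starts: the header length (IHL) is the
// low nibble of byte 0, counted in 32-bit words.
fun payloadOffset(ipHeader: ByteArray): Int = (ipHeader[0].toInt() and 0x0F) * 4

fun main() {
    val header = ByteArray(20)
    header[0] = 0x45 // version 4, header length 5 words (20 bytes)
    header[8] = 64   // TTL
    header[9] = 1    // protocol: ICMP
    println("${protocolName(header)} payload starts at byte ${payloadOffset(header)}")
}
```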

So to review there are three techniques:

  • fire off an ICMP packet and listen for a response
  • fire off a UDP packet and listen for a response
  • start a TCP connection, listen for a response and close the connection

Trade-offs

ICMP is designed to be used in the way that traceroute uses it: for gathering information about the network. Unlike TCP and UDP, it's not meant to transfer application data. Sometimes hops in the network will have rules that deny ICMP packets, so TCP and UDP are used as workarounds. While UDP may be able to get around rules that filter ICMP traffic, UDP probes are sent to a specific range of ports, and traffic on those ports may be blocked. Most devices allow TCP traffic on port 80, so that's a safe bet. With TCP, though, we have to send a SYN packet, wait for the SYN-ACK, and then send off an RST to close the half-open connection.

The nitty-gritty

So now that we've covered the background, let's look at how I wrote traceroute. I chose to implement the ICMP method, so keep that in mind.

Programming with syscalls

When you're writing a script or compiling a program that interacts with the internet, it seems like things just magically happen, but under the hood, your interpreter or compiled code is talking to the operating system in order to send and receive data. As programmers, the way we can access the capabilities of our operating system is through a library of functions called "syscalls."

traceroute relies on a number of syscalls. I'll go through most of them.

Sockets

One important syscall is socket. Sockets are how we're able to ask the operating system to connect us to the internet. They are an interface to the internet in the form of a file. This is great since files are simple and familiar: you read and write to them.

This implementation uses two sockets, one for sending packets and one for receiving packets. This is easier than, say, using just one, since socket works as a filter for different sorts of packets. With a socket, you can subscribe to the sorts of packets you want to receive. So in the receiving socket, I set a filter for ICMP packets and for the sending socket I'm specifying that I don't need any help from the OS to write my IP headers.

Sometimes store-bought isn't fine

When you open a socket you have to specify which protocol you want to work with. If you just want to send a TCP or UDP packet, you may be happy for the OS to work out all of the details, so you'd call socket with SOCK_STREAM or SOCK_DGRAM (more info can be found on the man page for socket). This is the off-the-shelf solution to network programming, and it makes life easy!

But in my case, I wanted to write and receive the entire packet, IP header and all. And to do this I had to open what's called a raw socket. A raw socket gives complete access to the entire packet. In fact, since ICMP is a control protocol, a part of IP, raw sockets are required in order to work with those packets. Raw sockets are opened by calling socket with SOCK_RAW. An important note: programs that open raw sockets require root privileges in order to run.

Maybe it dawned on you that you're able to run traceroute on your computer without sudo. That's probably because your traceroute executable is owned by root and has the setuid bit enabled, a file permission that lets users execute a file as the user that owns it. On Linux there's a smarter alternative that lets users execute a file with elevated permissions without willy-nilly bestowing root privileges: setting POSIX capabilities (such as CAP_NET_RAW) instead.

Checksums

When you use a raw socket, you're signing up to be responsible for everything in the header. Or at least almost everything: some kernels will fill in the IP checksum for you.

In any event, the ICMP header does require me to compute a checksum over the entire ICMP message (header and payload), so the receiver can verify that the request hasn't been corrupted or malformed.
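For reference, here's a sketch of that checksum in plain Kotlin. It follows the standard Internet checksum described in RFC 1071 (a one's-complement sum of 16-bit words) and uses it to fill in an ICMP Echo Request. The function names are hypothetical and this isn't lifted from the project, which works with C structs through cinterop instead of a ByteArray.

```kotlin
// Internet checksum (RFC 1071): one's-complement sum of 16-bit big-endian words.
fun internetChecksum(data: ByteArray): Int {
    var sum = 0L
    var i = 0
    while (i + 1 < data.size) {
        val word = ((data[i].toInt() and 0xFF) shl 8) or (data[i + 1].toInt() and 0xFF)
        sum += word
        i += 2
    }
    // An odd trailing byte is treated as if padded with a zero byte.
    if (i < data.size) sum += (data[i].toInt() and 0xFF) shl 8
    // Fold any carries back into the low 16 bits.
    while ((sum shr 16) != 0L) sum = (sum and 0xFFFFL) + (sum shr 16)
    return (sum.inv() and 0xFFFFL).toInt()
}

// Build an ICMP Echo Request: type 8, code 0, checksum, identifier, sequence, payload.
fun echoRequest(id: Int, seq: Int, payload: ByteArray = ByteArray(0)): ByteArray {
    val packet = ByteArray(8 + payload.size)
    packet[0] = 8 // type: Echo Request
    packet[1] = 0 // code
    // Bytes 2..3 (the checksum field) stay zero while the checksum is computed.
    packet[4] = (id shr 8).toByte()
    packet[5] = id.toByte()
    packet[6] = (seq shr 8).toByte()
    packet[7] = seq.toByte()
    payload.copyInto(packet, destinationOffset = 8)
    val checksum = internetChecksum(packet)
    packet[2] = (checksum shr 8).toByte()
    packet[3] = checksum.toByte()
    return packet
}

fun main() {
    val packet = echoRequest(id = 0x1234, seq = 1)
    // Re-running the checksum over a correctly checksummed message yields 0.
    println(internetChecksum(packet) == 0)
}
```

That last property is how a receiver validates a message: it sums everything, checksum field included, and expects the result to come out as zero.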

Call and response

Once you've made a socket, you need to use it. In my implementation, I use, in order: sendto, select, recvfrom and close.

  1. sendto fires off the packet, our internet probe.
  2. select is used to tell us whether our file descriptor (a number that's assigned to our socket) has any data ready for us to read it.
  3. recvfrom reads the data from the socket into a memory address we can then access.
  4. close frees up the sockets we used to send and receive packets.

All the rest

  • getaddrinfo: when you tell traceroute to query an address like google.com it needs to resolve that into an IP address, and this function gets us a linked list of IP addresses for that hostname.
  • getifaddrs: since we need to send a packet to a remote place and wait for a response, we have to tell that place where to send it back. So with this function we're able to get a list of the network interfaces available to our operating system (from the device it's running on).
  • inet_ntop: when we're dealing with addresses, they're almost always supplied to us in the form of a struct (in this case it accepts in_addr), and the fields are all a bunch of binary data (in network order). This function lets us convert the struct into a string value, the IPv4 network address in "dotted-decimal" format, i.e. "ddd.ddd.ddd.ddd".
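To show what that last conversion amounts to for IPv4, here's a sketch in plain Kotlin that formats four address bytes (already in network order) as dotted-decimal text, plus the reverse direction. The function names are hypothetical stand-ins, not the libc functions; the real inet_ntop takes a pointer to an in_addr and writes into a buffer you supply.

```kotlin
// Format four address bytes, in network (big-endian) order, as dotted-decimal text.
fun toDottedDecimal(address: ByteArray): String {
    require(address.size == 4) { "an IPv4 address is exactly 4 bytes" }
    // Bytes are signed in Kotlin, so mask each one to get its 0..255 value.
    return address.joinToString(".") { (it.toInt() and 0xFF).toString() }
}

// The reverse direction: parse dotted-decimal text back into four bytes.
fun fromDottedDecimal(text: String): ByteArray {
    val parts = text.split(".")
    require(parts.size == 4) { "expected four dot-separated numbers" }
    return ByteArray(4) { i ->
        val value = parts[i].toInt()
        require(value in 0..255) { "octet out of range: $value" }
        value.toByte()
    }
}

fun main() {
    val address = byteArrayOf(192.toByte(), 168.toByte(), 0, 1)
    println(toDottedDecimal(address))
}
```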

A big monkey wrench

After meticulously writing a bunch of code to put together the complete IP packet with ICMP payload, I hit a hard roadblock sending out the packet I had put together. Everything seemed right. My checksum code was fine. All the fields in both headers were correct. But the sendto syscall would always fail. Invalid argument was the only error information it returned. And to make matters worse, thanks to a macOS security feature called System Integrity Protection, it's impossible, or at least difficult enough that I just gave up, to run something like strace to debug the failing syscall.

After some trial and error, I found that changing around the ip_len field would send out my packet. Using tcpdump, I could see the packet send, and then the response get received. I could tell the packet that got sent had a weird data payload, but sendto wasn't failing outright.

So with that hint, I managed to dig up a wonderful blog post that explained there's a long-standing quirk in BSD-family kernels where two specific fields, ip_len and ip_off, need to be set in host order. Talk about an obscure error!
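If the difference between host and network order is fuzzy, this sketch in plain Kotlin shows how the same 16-bit value is laid out in memory either way. Network order is always big-endian, while host order depends on the CPU and is little-endian on most desktop machines. The length of 84 and the function names are assumptions made up for illustration.

```kotlin
// Swap the two bytes of a 16-bit value: what htons does on a little-endian host.
fun swap16(value: Int): Int = ((value and 0xFF) shl 8) or ((value shr 8) and 0xFF)

// Lay a 16-bit value out in memory the way a host of the given endianness would.
fun toBytes(value: Int, littleEndian: Boolean): ByteArray {
    val high = (value shr 8).toByte()
    val low = value.toByte()
    return if (littleEndian) byteArrayOf(low, high) else byteArrayOf(high, low)
}

// Render bytes as two-digit hex, separated by spaces.
fun hex(bytes: ByteArray): String =
    bytes.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }

fun main() {
    val ipLen = 84 // hypothetical total length: 20-byte IP header + 64-byte ICMP message
    println("network order:              ${hex(toBytes(ipLen, littleEndian = false))}")
    println("host order (little-endian): ${hex(toBytes(ipLen, littleEndian = true))}")
}
```

On a little-endian machine the two layouts differ, so converting ip_len with htons as you would for every other multi-byte field hands an affected kernel a byte-swapped length.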

Lessons learned

I didn't really know how this project would play out, and it's been a twisty path sorting out a lot of different things I haven't worked with before.

First, the importance of not giving up prematurely when you're learning something. You never know when you're right on the cusp of solving an issue. I find that if you're truly stumped by something, step away for a bit - a day, a week, maybe even months - and come back to the problem with a fresh mind. It worked for me on this project.

Second, don't be intimidated by seemingly complex subjects. The patience to painstakingly piece together knowledge like this is a skill that can be learned through practice. I had the impression of network programming as something reserved for a college course, but in my experience, it's totally possible to teach yourself.

Third, I think books are an under-appreciated resource! For network programming, I can't recommend UNIX Network Programming, Volume 1: The Sockets Networking API by W. Richard Stevens enough. Tutorials are fine for some things, but it's hard to find another format that presents so much information in a clear and complete way. When learning a new topic, seek out a wide array of sources to find what works best for you. Sometimes you just need to find the information presented in the right way, and this isn't necessarily the same for everyone.

I'd be remiss not to mention my impression of using Kotlin for the first time. Kotlin/Native is a great platform and its interoperability is very powerful. That said, it's probably not wise to just go ahead and rewrite C applications in Kotlin. I found debugging to be hard, and it's easier using the very well-developed tooling around C and related languages. If you're using Kotlin/Native, it's best to stick to what it excels at: interoperating with native libraries to build multi-platform native applications.

This project hasn't been easy, but that made it all the more rewarding, and I've learned a lot along the way. I think anyone who wants to fill out their knowledge about the internet should just jump into the deep end and give network programming a try. I feel more confident having peeked behind the curtain, and perhaps now network errors will be less of a mystery.