Porting DNS Code from Zig 0.15 to 0.16: IO, Queues, and Concurrency


As helpful as everyone wants to be about teaching Zig, it’s a hard thing to do when the language is evolving this quickly. With Zig 0.16 almost upon us, and with breaking standard library changes already landing, it felt like a good time to revisit some real code.

Zig 0.16 is inevitable

I had a small Zig program that performed DNS lookups using std.net in Zig 0.15, and I decided to port it to Zig 0.16 to see how the new IO model and std.Io.net APIs actually behave in practice. This was my Zig 0.15 code:

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    const hostname = "sheran.sg";

    // Resolve the hostname; port 0 because only the addresses matter here.
    const result = try std.net.getAddressList(allocator, hostname, 0);
    defer result.deinit();
    for (result.addrs) |r| {
        // Format each address into a heap-allocated string before printing.
        var writer: std.Io.Writer.Allocating = .init(allocator);
        defer writer.deinit();
        try r.format(&writer.writer);
        const ip = try writer.toOwnedSlice();
        defer allocator.free(ip);
        std.debug.print("Ip: {s}\n", .{ip});
    }
}

Fairly straightforward code. It does a DNS lookup, collects all the IP addresses that belong to the hostname, and prints them (std.debug.print writes to stderr, not stdout). This includes both IPv4 and IPv6 addresses.

In Zig 0.16, std.net is gone. Well, not gone, but moved. This change has also broken zig std. The move itself makes sense: networking functions are clearly IO, so they now live under std.Io.net. OK, so how do we look up a hostname now? Ah, here we are: std.Io.net.HostName.lookup().

I looked at the Zig manual (read: source-code) and naively coded this up:

 1const std = @import("std");
 2const Io = std.Io;
 3
 4
 5pub fn main() !void {
 6    var gpa = std.heap.DebugAllocator(.{}){};
 7    defer std.debug.assert(gpa.deinit() == .ok);
 8    const allocator = gpa.allocator();
 9
10    const hostname: Io.net.HostName = try .init("sheran.sg");
11
12    var threaded: Io.Threaded = .init(allocator);
13    defer threaded.deinit();
14    const io = threaded.io();
15
16    var elem_buf: [16]Io.net.HostName.LookupResult = undefined;
17    var queue: Io.Queue(Io.net.HostName.LookupResult) = .init(&elem_buf);
18    var cname_buf: [Io.net.HostName.max_len]u8 = undefined;
19
20    Io.net.HostName.lookup(hostname, io, &queue, .{
21        .port = 0,
22        .canonical_name_buffer = &cname_buf,
23    });
24
25    while(queue.getOne(io)) |result| {
26       switch(result) {
27            .address => {
28                std.debug.print("{any}\n",.{result});
29            },
30            .canonical_name => {},
31            .end => |e| {
32                return e;
33            },
34        } 
35    } else |_| {}
36}

I ran it and it worked the first time! I’m obviously a genius. Now, the more eagle-eyed of you may ask, “Why did you pick 16 as your elem_buf size?” (line 16). Good question. I used 16 because the source code says: “Guaranteed not to block if provided queue has capacity at least 16.” Then I had a nagging feeling: “OK, but what if a DNS lookup returned 17 IPs?”

I tested that hypothesis by adding 17 A records to a domain name I own and running the code again. It hung. Deadlocked. Quickly falling off that Dunning-Kruger peak now. The deadlock happens because lookup() tries to add a 17th element to a queue that is already full, before my consumer has even started. Since lookup() is called synchronously, it blocks waiting for space in the queue, and control never reaches the consumer loop (line 25 onwards) that would free up that space.

Incidentally, I also checked the RFC for DNS and technically, there’s no limit to the number of A records that each domain name can have, which made me think that there could be an interesting fuzz test for code that does DNS resolutions. But that’s for another post.

Then I remembered Andrew’s talk at Zigtoberfest ’25. He spoke about the use of both async AND concurrent in the new Zig 0.16 release. My use case looked very much like it needed io.concurrent(), so I tried that and it worked:

 1const std = @import("std");
 2const Io = std.Io;
 3
 4pub fn main() !void {
 5    var gpa = std.heap.DebugAllocator(.{}){};
 6    defer std.debug.assert(gpa.deinit() == .ok);
 7    const allocator = gpa.allocator();
 8
 9    var threaded: Io.Threaded = .init(allocator);
10    defer threaded.deinit();
11    const io = threaded.io();
12
13    const hostname = try Io.net.HostName.init("sheran.sg");
14
15    var elem_buf: [2]Io.net.HostName.LookupResult = undefined;
16    var queue: Io.Queue(Io.net.HostName.LookupResult) = .init(&elem_buf);
17    var name_buf: [Io.net.HostName.max_len]u8 = undefined;
18    
19    var lookup = try io.concurrent(Io.net.HostName.lookup, .{ hostname, io, &queue, .{
20        .port = 0,
21        .canonical_name_buffer = &name_buf,
22    } });
23    defer lookup.cancel(io);
24
25    while(queue.getOne(io)) |res| {
26       switch(res) {
27            .address => {
28                var writer: std.Io.Writer.Allocating = .init(allocator);
29                defer writer.deinit();
30                try res.address.format(&writer.writer);
31                const ip_port = try writer.toOwnedSlice();
32                defer allocator.free(ip_port);
33
34                std.debug.print("Ip {s}\n",.{ip_port});
35            },
36            .canonical_name => {
37                std.debug.print("Cname {s}\n",.{res.canonical_name.bytes});
38            },
39            .end => |e|  {
40               return e;
41            },
42        }
43
44    } else |_| {}
45}

What’s interesting here is that the elem_buf buffer is now optional: you can create the queue with .init(&.{}) (line 16) and it will still work. With this new code, the lookup kicks off concurrently and the consumer loop starts immediately after. Results beyond the buffer size are no longer a problem, because the consumer keeps pulling elements off the queue while the lookup is still adding them. Zig’s std.Io.Queue works very much like Go channels. In fact, in his ‘Don’t forget to flush’ talk (at 17:41), Andrew makes a comparison with Go.
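To see the channel-like behaviour in isolation, here is a minimal sketch with no networking involved. The produce function and the count of 17 are made up for illustration, and putOne is assumed to be the counterpart of getOne on std.Io.Queue. Everything else uses only the calls from the listing above, as they exist in the 0.16 development snapshot used in this post, so treat it as a sketch rather than a reference.

```zig
const std = @import("std");
const Io = std.Io;

// Hypothetical producer for illustration: pushes 0..count into the queue.
// With a zero-capacity queue, each put waits until the consumer takes the item.
fn produce(io: Io, queue: *Io.Queue(u32), count: u32) void {
    var i: u32 = 0;
    while (i < count) : (i += 1) {
        // Assumption: putOne mirrors getOne. Stop quietly if canceled.
        queue.putOne(io, i) catch return;
    }
}

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);

    var threaded: Io.Threaded = .init(gpa.allocator());
    defer threaded.deinit();
    const io = threaded.io();

    // No backing buffer at all: every put has to meet a matching get.
    var queue: Io.Queue(u32) = .init(&.{});

    const count: u32 = 17; // deliberately more than the "safe" 16
    var producer = try io.concurrent(produce, .{ io, &queue, count });
    defer producer.cancel(io);

    // The consumer runs while the producer is still producing.
    var received: u32 = 0;
    while (received < count) : (received += 1) {
        const value = try queue.getOne(io);
        std.debug.print("got {d}\n", .{value});
    }
}
```

The point of the sketch is the shape, not the numbers: the producer is started with io.concurrent() before the consumer loop, so the queue capacity stops mattering for correctness.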

To take it a step further, I decided to test async versus concurrent by running my code in single-threaded mode: zig run dnslookup.zig -fsingle-threaded. As expected, io.concurrent() failed with ConcurrencyUnavailable. I then changed io.concurrent() to io.async() and tried single-threaded mode again. This time I got an ominous-looking error which, when traced, turned out to be a deadlock. That’s because io.async() allows the lookup to run concurrently but does not require it, and you can’t safely run both the lookup and the consumer on the same thread when the lookup can block on a full queue.
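If you would rather report the ConcurrencyUnavailable case explicitly than let try bubble it up, something like the following sketch should do. As before, produce is a hypothetical stand-in for a blocking producer such as the lookup, and putOne is assumed to exist alongside getOne. The rest mirrors the calls already used in this post.

```zig
const std = @import("std");
const Io = std.Io;

// Hypothetical stand-in for a producer that can block, such as a DNS lookup.
fn produce(io: Io, queue: *Io.Queue(u8)) void {
    // Assumption: putOne mirrors getOne. Stop quietly if canceled.
    for ("zig") |byte| queue.putOne(io, byte) catch return;
}

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);

    var threaded: Io.Threaded = .init(gpa.allocator());
    defer threaded.deinit();
    const io = threaded.io();

    var queue: Io.Queue(u8) = .init(&.{});

    // io.concurrent() fails up front when the producer cannot run alongside
    // the consumer (for example when built with -fsingle-threaded), which is
    // preferable to deadlocking later.
    var producer = io.concurrent(produce, .{ io, &queue }) catch |err| {
        std.debug.print("cannot run producer concurrently: {s}\n", .{@errorName(err)});
        return err;
    };
    defer producer.cancel(io);

    // Consume exactly the three bytes the producer sends.
    for (0..3) |_| {
        const byte = try queue.getOne(io);
        std.debug.print("{c}", .{byte});
    }
    std.debug.print("\n", .{});
}
```

Failing early with a clear message is the design choice here: swapping in io.async() would compile and start, but in single-threaded mode it can end in the deadlock described above.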

