Type Converters in Rust
Advent of Code 2023 has just kicked off, and I'm going to try something a bit different this year, I'm going to try and share useful concepts and patterns that play a role in solving each day's puzzle.
Today, we're looking at how type converters work in Rust, and how you can use them to create intuitive interfaces for your types.
What are Type Converters?
Type converters are present in most languages, they're a means of converting one type into another. This is extremely common when converting between primitive types like integers and floating point numbers, where the traditional C-style syntax would be something like: (int)floatValue
.
Different languages have different syntaxes for this, but the concept is the same: take a value of one type, and convert it to a value of another type. From a function signature perspective, we're looking for fn convert(from: T) -> U
, where T
and U
are different types.
Types of Type Converters
This kind of conversion is so common in practical code that instead of having each type implement its own conversion interface, most languages provide a standard means of doing so. This sounds great, until you realize that in the world of type conversions, one size really doesn't fit all.
Let's take C# for example:
object input = 42.5;
// Here we expect to do a fast "cast" from the input to an integer.
// If it fails, we expect to get a TypeCastException.
var intValue = (int)input;
// Here we expect to get null if the conversion fails, but it should
// still be a fast operation.
var maybeIntValue = input as int;
// Here we expect to run some kind of parsing logic across the input
// (i.e. it's likely to be more costly and could fail with an exception).
Convert.ToInt32(input);
Each of these patterns conveys meaning to the reader, and each of them has its own set of tradeoffs which the author of the code needs to consider before using them. As such, having different syntaxes for each of them is actually quite useful. In general, we can break down type conversions into three categories:
- Fast, Safe, Casts: These are used when the conversion is expected to be fast, and the conversion is expected to succeed. If the conversion fails, it should throw an exception or panic (but this should be rare/never happen).
- Fast, Maybe Casts: These are used when the conversion is expected to be fast, and the conversion is expected to fail. If the conversion fails, it should return an indication of the failure (i.e.
null
orNone
). - Slow, Parsing: These are used when the conversion is expected to be slow, and the conversion is expected to fail. If the conversion fails, it should surface detailed information about why it failed (i.e. an exception with a detailed message, or a
Err
in Rust).
In the Rust world, these look like the following:
// Fast, Safe, Casts
let input = 42.5 as i32;
// or using the `From`/`Into` traits
let value = input.into();
// Fast, Maybe Casts
let value = input.try_into()?;
// Slow, Parsing
let value = input.parse::<i32>()?;
Type Converter APIs
A challenge of language design when it comes to type coercion is the question of how best to satisfy the "Open-Closed Principle" (O in SOLID). In other words, how do you allow developers to introduce type converters without needing to modify types that they don't own?
Let's take the example of me wanting to implement a DateTimeLocation
type which wraps a DateTime
and a Location
(i.e. a latitude and longitude). I'd like to be able to easily convert bi-directionally between a DateTimeLocation
and a DateTime
or Location
, but I don't want to have to modify the DateTime
or Location
types to do so.
If your language doesn't have support for bi-directional type converters, you're going to need to implement two different APIs: a forward converter and a reverse converter. This works, but it really isn't the nicest API to use since it requires that people using your API switch to calling methods on your type rather than relying on the language's built-in type conversion syntax.
class DateTimeLocation:
def __init__(self, date_time: datetime, location: Location = None):
self.date_time = date_time
self.location = location
def to_date_time(self) -> datetime:
return self.date_time
def to_location(self) -> Location:
return self.location
my_date_time = DateTimeLocation(datetime.now(), Location(36.12, -86.67))
# I can't do this (which would be the normal Pythonic way to convert between types)
# because I can't modify the `datetime` type.
date = datetime(my_date_time)
# Instead, I need to do this:
date = my_date_time.to_date_time()
This is where Rust comes in, providing a default implementation of the U: Into<T>
trait for any type which implements T: From<U>
. This means that if I implement Into<Date> for DateTimeLocation
, anyone can convert a DateTimeLocation
into a Date
by calling date_time_location.into()
or by using Date::from(date_time_location)
.
This intrinsic bi-directionality means that I can easily extend the type conversion semantics of types I don't control, so long as I control the type on one side of the conversion. C# has a similar ability (though it does require explicitly defining the conversion methods for U -> T
and T -> U
on your type T
).
Implementing Type Converters in Rust
Let's take the example of implementing a new type converter in Rust, in this case to parse a simple file format. Let's imagine that we have a file which looks like the following:
# <id>: <value>, <value>, <value>, ...
1: 1, 2, 3, 4, 5
2: 6, 7, 8, 9, 10
We'd like to parse this into a series of Record
s, where each record has an id
corresponding to the number at the start of the line, and a series of values
corresponding to the numbers after the colon.
struct Record {
id: u32,
values: Vec<u32>,
}
Using the From Trait
The simplest way to implement this would be to use the From
trait, which allows us to implement a conversion from one type to another. In this case, we'd like to implement From<String> for Record
, which would allow us to convert a String
into a Record
.
tip
If we use From<String>
then we need to pass around ownership over the String
itself, which means that we'll need to clone it if we want to keep a copy of it. We can instead implement From<T>
for all types which implement AsRef<str>
(i.e. &str
, String
, etc) and then use &str
as our input type. This means that we can pass around references to the original string, and we don't need to clone it.
impl<T> From<T> for Record
where
T: AsRef<str>,
{
fn from(input: T) -> Self {
let input = input.as_ref();
let (id, values) = input.split_once(':').unwrap();
let id = id.trim().parse::<u32>().unwrap();
let values = values
.split(',')
.map(|v| v.trim().parse::<u32>().unwrap())
.collect();
Self { id, values }
}
}
This allow us to construct our list of records by doing the following:
let records = input
.lines()
// We filter out any empty lines or comments
.filter(|l| !l.is_empty() && !l.starts_with('#'))
// And then convert each line into a Record
.map(Record::from)
.collect::<Vec<_>>();
This is great, but you'll notice that we're using a lot of .unwrap()
calls in our From
implementation. In general, .unwrap()
is a pretty risky thing to do as it'll result in a panic!()
when something goes wrong. Instead, we'd prefer to have the ability to handle those errors gracefully...
Using the TryFrom Trait
This is where the TryFrom
trait comes in. This is similar to the From
trait, but instead of always returning a value, it returns a Result
which can either be the successfully converted value, or an error.
This allows us to propagate errors up to the caller, and handle them gracefully without crashing the process.
impl<T> TryFrom<T> for Record
where
T: AsRef<str>,
{
type Error = Box<dyn std::error::Error>;
fn try_from(input: T) -> Result<Self, Self::Error> {
let input = input.as_ref();
let (id, values_str) = input.split_once(':')
.ok_or(format!("Invalid format, expected '<id>: <value>, ...'."))?;
let id = id.trim().parse::<u32>()
.map_err(|err| format!("Invalid ID: {err}"))?;
let values = Vec::new();
for value in values_str.split(',') {
let value = value.trim().parse::<u32>()
.map_err(|err| format!("Invalid value: {err}"))?;
values.push(value);
}
Ok(Self { id, values })
}
}
Now, when we go to use this, we can handle these errors a bit more gracefully:
let records = input
.lines()
// We filter out any empty lines or comments
.filter(|l| !l.is_empty() && !l.starts_with('#'))
// And then convert each line into a Record
.map(|line| Record::try_from(line).unwrap_or_else(|err| {
eprintln!("Failed to parse line: {err}");
// Oh, did I say gracefully? I meant just kill the process...
std::process::exit(1);
}))
.collect::<Vec<_>>();
Now this is all well and good, but you might have noticed that we're using .try_from()
while inside our implementation we're using .parse()
to try to convert string values into numbers. What's up with that?
Using the FromStr Trait
It turns out that parsing strings into other types is a common enough pattern that Rust gives it its own special treatment. Any type which implements the FromStr
trait can be parsed from a string using the .parse()
method. It works almost identically to the TryFrom
trait, except that it's implemented for &str
rather than T
.
In practice, this simplifies the type signature for our implementation a little bit, and also means that someone reading our code can more easily see that we're doing some string parsing.
impl FromStr for Record {
type Err = Box<dyn std::error::Error>;
fn from_str(input: &str) -> Result<Self, Self::Err> {
let (id, values_str) = input.split_once(':')
.ok_or(format!("Invalid format, expected '<id>: <value>, ...'."))?;
let id = id.trim().parse::<u32>()
.map_err(|err| format!("Invalid ID: {err}"))?;
let values = Vec::new();
for value in values_str.split(',') {
let value = value.trim().parse::<u32>()
.map_err(|err| format!("Invalid value: {err}"))?;
values.push(value);
}
Ok(Self { id, values })
}
}
All this allows us to write our code like this:
let records = input
.lines()
// We filter out any empty lines or comments
.filter(|l| !l.is_empty() && !l.starts_with('#'))
// And then convert each line into a Record
.map(|line| line.parse().unwrap_or_else(|err| {
eprintln!("Failed to parse line: {err}");
// Oh, did I say gracefully? I meant just kill the process...
std::process::exit(1);
}))
.collect::<Vec<_>>();
Pairing Type Converters with Iterators
As a last bit of fun, let's look at how you can pair what we covered yesterday in Iterators in Rust with type converters to be able to easily consume a stream of data and convert it into our desired types.
struct RecordIterator<'a> {
value: &'a str,
}
impl<'a> Iterator for RecordIterator<'a> {
// We have our iterator return a Result, so that we can propagate errors
type Item = Result<Record, Box<dyn std::error::Error>>;
fn next(&mut self) -> Option<Self::Item> {
if let Some((record, rest)) = self.value.split_once('\n') {
self.value = rest;
Some(record.parse())
}
None
}
}
And now we can use this iterator to parse our records:
for record in RecordIterator { ... } {
match record {
Ok(record) => { ... },
Err(err) => { ... },
}
}
// Or if we want to collect them into a Vec and discard anything that failed to parse:
let records = RecordIterator { ... }
.filter_map(Result::ok)
.collect::<Vec<_>>();
Conclusion
Type converters are a great way to encapsulate the logic for converting between different types, and there are several different ways to implement them in Rust. The From
trait is great for fast, safe, casts, the TryFrom
trait is great for fast, maybe casts, and the FromStr
trait is great for slow parsing from strings.
Pairing these with iterators allows us to easily consume streams of data and convert them into our desired types, and using FromStr
lets us handle errors gracefully (which saves everyone a ton of time when things go wrong).
Benjamin Pannell
Site Reliability Engineer, Microsoft
Dublin, Ireland