Mononym harness several features unique to Rust and have a simpler implementation of named values compared to existing implementations in languages like Haskell.
At its core Mononym makes use of the idea that the use of impl Trait
in return position produces a new abstract type that is unique to that function.
Consider the following example:
trait Name {}
impl Name for () {}
fn foo() -> impl Name {}
fn bar() -> impl Name {}
fn same<T>(_: T, _: T) {}
same(foo(), foo());
same(bar(), bar());
same(foo(), bar()); // error
same(foo(), ()); // error
same(bar(), ()); // error
We define a dummy trait Name
that is implemented only for ()
. We then define two functions foo()
and bar()
that return impl Name
by returning ()
. Although both foo()
and bar()
are in effect returning the same concrete type ()
, they are considered different types by the Rust compiler.
To test whether the types of two values are equal, we define the function same<T>(_: T, _: T)
that pass the compilation check if two values have the same type. When we try to compile the code, we will find that both same(foo(), foo())
and same(bar(), bar())
pass the compilation, but same(foo(), bar())
would result in a type error. Similarly, the check on same(foo(), ())
fails, as Rust treats the returned impl Name
different than the underlying type ()
.
The use of impl Name
in return position provides the first step toward defining unique names for each value. However the above code shows an obvious issue of using returned impl Name
as unique type, as the test same(foo(), foo())
pass the compilation. This means that two calls to the same function returning impl Trait
will return values of the same opaque type. If we want the name type to be truly unique, the test same(foo(), foo())
should have failed.
We can try to force the opaque type returned in a function returning impl Name
unique by making the function generic:
trait Name {}
impl Name for () {}
fn type_name<T>(_: T) -> impl Name {}
fn same<T>(_: T, _: T) {}
same(type_name(()), type_name(()));
same(type_name(0_u64), type_name(0_u64));
same(type_name(()), type_name(0_u64)); // error
same(type_name(0_u32), type_name(0_u64)); // error
same(type_name(||{}), type_name(||{})) // error
In the above example, we define a type_name
function that accepts a dummy value of any generic type, and return an impl Name
. From there we can see that when type_name
is called with the same type, the returned impl Name
is considered the same type. As a result, the tests to same(type_name(()), type_name(()))
and same(type_name(0_u64), type_name(0_u64))
pass the compilation, but tests such as same(type_name(()), type_name(0_u64))
fail the compilation.
In the last test, we also see that the compilation test for same(type_name(||{}), type_name(||{}))
has failed. This is because each new closure expression in Rust produces a unique anonymous type. With this, we know that as long as we are providing different closure expressions, the returned impl Name
would be considered different type by Rust.
Moving one step further, we can instead define the implementation for Name
using the anonymous closure type:
use core::marker::PhantomData;
trait Name {}
struct SomeName<N>(PhantomData<N>);
impl <N> Name for SomeName<N> {}
fn unsafe_new_name<N>(_: N) -> impl Name {
SomeName::<N>(PhantomData)
}
fn same<T>(_: T, _: T) {}
let f = ||{};
same(unsafe_new_name(f), unsafe_new_name(f));
fn foo() -> impl Name {
unsafe_new_name(||{})
}
same(foo(), foo())
same(unsafe_new_name(||{}), unsafe_new_name(||{})); // error
We define a struct SomeName<N>
that implements Name
based on the unique type N
. We then rename our name generator function to unsafe_new_name
and make it return SomeName
based on the given generic type N
. We keep the SomeName
struct and unsafe_new_name
private, so that external users cannot create new impl Name
that may violate the uniqueness guarantee.
We can see that at this stage the name creation is still unsafe, as user can still bind the closure expression to a variable like let f = ||{}
, and then the resulting test same(unsafe_new_name(f), unsafe_new_name(f))
would pass. Similarly, if we define a function like foo() -> impl Name
that calls unsafe_new_name(||{})
inside the function, the resulting test same(foo(), foo())
would still pass the compilation.
To guarantee that every opaque type behind an impl Name
is unique, we need to ensure that not only the inner function that returns impl Name
is generic, but also all possible functions that return impl Name
must be generic as well.
While it is possible to ask end users to provide a unique type using ||{}
at each function definition and and call, doing so can be error prone and confusing. Instead, we want to define a name seed that is a unique type itself, and use it to generate new names.
We implement the Seed
type as follows:
use core::marker::PhantomData;
pub trait Name {}
struct SomeName<N>(PhantomData<N>);
impl <N> Name for SomeName<N> {}
pub struct Seed<N>(PhantomData<N>);
impl <N: Name> Seed<N> {
pub fn new_name(self) -> impl Name
{
SomeName::<N>(PhantomData)
}
pub fn replicate(self) -> (impl Seed, impl Seed)
{
(Seed(PhantomData::<N>), Seed(PhantomData::<N>))
}
}
fn test(seed: impl Seed) {
fn same<T>(_: T, _: T) {}
let (seed1, seed2) = seed.replicate();
same(seed1, seed2); // error
same(seed1.new_name(), seed2.new_name()); // error
}
The type Seed<N>
is parameterized by a name N
and provides two methods. The first method new_name(self)
consumes the seed and returns a new impl Name
(notice the lack of &
in self
). Since the seed is consumed when generating the name, the same seed cannot be used to generate another new name of the same type.
Although the seed is consumed during name generation, the second method replicate(self)
consumes the original seed, and returns two new seeds with unique names in the form of impl Seed
. Notice that the two impl Seed
returned by replicate
are considered different types by Rust, even when they have the same underlying concrete type Seed<N>
. By calling replicate
one or more times, we will be able to generate multiple names with unique types.
Since there is no public constructor function to create new Seed
value, the only way external users can create new seed is by replicating existing seeds. In this way, external functions would have to always accept impl Seed
as an argument somewhere along the function calls to be able to generate new names.
By treating our test
function as an external function, it is forced to accept a impl Seed
in order to generate new impl Name
s. We first use seed.replicate()
to create two new seeds seed1
and seed2
. When we compile the code, we can find out that the test same(seed1, seed2)
fails, indicating that the two replicated seeds have different types. Similarly, the test same(seed1.new_name(), seed2.new_name())
fails because the two names are generated from different seeds. It is also not possible to do something like same(seed.new_name(), seed.new_name())
, because the affine type system of Rust consumes the seed during name generation and to not allow the seed to be reused.
The Seed
type we defined earlier provides a new_name
method that returns unique impl Name
. While having a unique name is not very useful on its own, it can be used to define a Named<Name, T>
struct to assign unique names to a given value of type T
. The Named
struct is defined and used as follows:
use core::marker::PhantomData;
pub struct Named<Name, T>(T, PhantomData<Name>);
impl <Name, T> Named<Name, T> {
pub fn value(&self) -> &T { &self.0 }
pub fn into_value(self) -> T { self.0 }
}
pub trait Name {}
pub struct Seed<N>(PhantomData<N>);
impl <N: Name> Seed<N> {
pub fn new_named<T>(self, value: T) -> Named<impl Name, T>
{
Named::<N, _>(value, PhantomData)
}
pub fn replicate(self) -> (impl Seed, impl Seed)
{
(Seed(PhantomData::<N>), Seed(PhantomData::<N>))
}
}
fn test(seed: impl Seed) {
fn same<T>(_: T, _: T) {}
let (seed1, seed2) = seed.replicate();
same(seed1.new_named(1), seed2.new_named(1)); // error
}
The struct Named<Name, T>
is essentially a newtype wrapper around T
, with the underlying value kept private. The Named
type provides two public methods, value
for getting a reference to the underlying value, and into_value
to convert the named value to the underlying value.
The Seed
type now provides a new_named
method that accepts an owned value of type T
, and returns a Named<impl Name, T>
. Because the impl Name
is nested inside Named
, we can guarantee that the new name given to the value is unique, provided that the Seed
type is unique.
Similar to earlier, we can test that two named values indeed have different names by writing a test
function that accepts a impl Seed
. After replicating the seed, we can verify that the test same(seed1.new_named(1), seed2.new_named(1))
fails with error during compilation. This shows that the Named<impl Name, i32>
vallues returned by the two calls to new_name
are indeed unique.
We can think of the type Named<Name, T>
as being a singleton type, that is, a type with only one possible value. With Rust's affine type system, the singleton guarantee is even stronger that we can never have two Rust values of type Named<Name, T>
with the same Name
type.
Our setup for generating uniquely named values is mostly complete, provided we are able to hand over the first unique Seed
value to the main function to start generating new names. But we cannot simply expose a function like fn new_seed() -> impl Seed
, as we know that two calls to the same function will return two values of the same type, thereby making them non-unique.
We know that in languages like Haskell, it is possible to generate unique types by using continuation-passing-style with higher-ranked continuations. While Rust do not currently support higher-ranked types, it instead supports higher-ranked trait bounds (HRTB) which can be used in similar way.
use core::marker::PhantomData;
pub trait Name {}
pub struct Life<'name>(PhantomData<*mut &'name ()>);
impl<'name> Name for Life<'name> {}
pub fn with_seed<R>(
cont: impl for<'name> FnOnce(Seed<Life<'name>>) -> R
) -> R {
cont(Seed(PhantomData))
}
pub struct Named<Name, T>(T, PhantomData<Name>);
impl <Name, T> Named<Name, T> {
pub fn value(&self) -> &T { &self.0 }
pub fn into_value(self) -> T { self.0 }
}
pub struct Seed<N>(PhantomData<N>);
impl <N: Name> Seed<N> {
pub fn new_named<T>(self, value: T) -> Named<impl Name, T>
{
Named::<N, _>(value, PhantomData)
}
pub fn replicate(self) -> (impl Seed, impl Seed)
{
(Seed(PhantomData::<N>), Seed(PhantomData::<N>))
}
}
fn same<T>(_: T, _: T) {}
with_seed(|seed| {
let (seed1, seed2) = seed.replicate();
same(seed1, seed2); // error
same(seed1.new_named(1), seed2.new_named(1)); // error
});
with_seed(|seed1| {
with_seed(|seed2| {
same(seed1, seed2); // error
same(seed1.new_named(1), seed2.new_named(1)); // error
});
});
We first come out with a different way of generating unique types, using the Life
type that is parameterized by a unique lifetime. The struct is defined as struct Life<'name>(PhantomData<*mut &'name ()>)
. The inner type PhantomData<*mut &'name ()>
makes Rust treats Life<'name>
as if it is a raw pointer of type *mut &'name ()
. We specifically make it so that Rust treats 'name
as an invariant phantom lifetime. This means that if we have two types Life<'name1>
and Life<'name2>
, Rust would consider them as different types even if there are partial overlaps such as 'name1: 'name2
.
Using Life
, we now simplify the problem of generating unique names to generating unique lifetimes. We then define the with_seed
function, which accepts a continuation with a higher-ranked trait bound impl for<'name> FnOnce(Seed<Life<'name>>) -> R
. The for<'name>
part forces the contnuation closure to work with all possible lifetimes. As a result, we can guarantee that the type Life<'name>
is always unique inside the closure. By using Life<'name>
as the unique type inside Seed<Life<'name>>
, we ensure that the seed type given to the continuation closure is also unique.
We can now repeat the same test we had earlier, but now with the tests running inside the closure given to with_seed
. We can verify that after replicating the seed, the tests same(seed1, seed2)
and same(seed1.new_named(1), seed2.new_named(1))
still fail with compilation error, indicating that the names generated are different.
We can also repeat the same test with two nested calls to with_seed
, thereby getting two separate fresh seeds seed1
and seed2
. Thanks to the magic of HRTB and invariant phantom lifetime, we can verify that even in this case the test same(seed1, seed2)
still fails, indicating that Rust is treating the two underlying lifetimes differently. Similarly, the test same(seed1.new_named(1), seed2.new_named(1))
also fails, indicating that names generated by two different seeds are indeed different.
The techniques of using phantom lifetimes as names and using HRTB to generate new lifetimes is first explored in GhostCell. With that, we close the loop of unique name generation by requiring the top level program to generate the first seed using with_seed
. Furthermore, since multiple calls to with_seed
is safe, this means that there is no way for external users to do unsafe construction of two seeds or named values of the same type. From this, we can safely reason that assuming unsafe Rust is not used, whenever we get a value of type Named<Name, T>
, the type Name
is always uniquely assigned to the underlying value.