Fluent Rust - Checking Trait Implementation

Migo · October 15, 2023


Caveat: Some of the content in this article may catch you off guard.

Rust has a powerful, rich type system that guarantees memory safety and thread safety.

Everything comes with cost, however.

As tempting as Rust's type system may sound, it doesn't offer the best ergonomics in every respect. Runtime reflection, for example, is quite unergonomic in Rust. Compare it with Python, where checking whether an object is an instance of a class that inherits from a given parent class looks like this:

# Python code
class A:
    ...
class B(A):
    ...

def main():
    a: B = B()
    assert isinstance(a, A)  # one-liner!

Let's see Rust equivalent:

// Rust code
macro_rules! is_trait {
    ($name:ty, $trait_name:path) => {{
        trait __InnerMarkerTrait {
            fn __is_trait_inner_method() -> bool {
                false
            }
        }
        struct __TraitTest<T>(T);
        impl<T: $trait_name> __TraitTest<T> {
            fn __is_trait_inner_method() -> bool {
                true
            }
        }
        impl<T> __InnerMarkerTrait for __TraitTest<T> {}
        __TraitTest::<$name>::__is_trait_inner_method()
    }};
}

trait A {} 
struct B;
impl A for B {}

fn main() {
    assert!(is_trait!(B, A))
}

Big, scary, off-putting, isn't it?
To be completely fair, Rust has no notion of a class.
Some of you may think that a struct is just the same as a class in other languages and that a trait is the Rust equivalent of an interface, but that's not quite accurate.

One of the reasons why runtime reflection is so easy in Python is that Python classes carry their own metadata, including their base classes and method resolution order. Let's look at the Python code again:

# Python code
class A:
    ...
class B(A):
    ...

def main():
    assert B.__bases__ == (A,)  # true
    # B.__mro__ prints as (<class '__main__.B'>, <class '__main__.A'>, <class 'object'>)
    assert B.__mro__ == (B, A, object)

In fact, the moment you declare a class, you essentially create a bundle of attributes and methods like the following:

B.__abstractmethods__ B.__delattr__(        B.__format__(         B.__init_subclass__(  B.__name__            B.__reduce_ex__(      B.__subclasses__()   
B.__annotations__     B.__dict__            B.__ge__(             B.__instancecheck__(  B.__ne__(             B.__repr__(           B.__subclasshook__(  
B.__base__()          B.__dictoffset__      B.__getattribute__(   B.__itemsize__        B.__new__(            B.__ror__(            B.__text_signature__ 
B.__bases__           B.__dir__(            B.__getstate__(       B.__le__(             B.__or__(             B.__setattr__(        B.__weakref__        
B.__basicsize__       B.__doc__             B.__gt__(             B.__lt__(             B.__prepare__(        B.__sizeof__(         B.__weakrefoffset__  
B.__call__(           B.__eq__(             B.__hash__(           B.__module__          B.__qualname__        B.__str__(           
B.__class__(          B.__flags__           B.__init__(           B.__mro__             B.__reduce__(         B.__subclasscheck__(

This raises the question: why doesn't Rust provide such functionality?

Well, it's not a fault but a design decision. When there is a cost, Rust wants you to make it explicit.

Unlike Python and other languages that readily engage in some form of metaprogramming (the practice of writing code that writes other code), Rust seems quite conservative.

HOWEVER, that doesn't mean that we as developers have no way to add ergonomics to the language.



You may find the Rust example above quite contrived, because we don't usually need to check whether a type implements a certain interface.

That's correct! In 99 out of 100 cases, the situation we actually encounter is:

I have an instance that could be Any, and I want to know whether I can downcast it to a certain type.

The following shows that use case:

trait A {}
struct B;
impl A for B {}

// Runtime reflection
fn downcaster(arg: Box<dyn std::any::Any>) -> impl A {
    *arg.downcast::<B>().unwrap() 
}

This itself comes with the costs of:

  • allocating on the heap (because it uses Box)
  • a vtable lookup caused by using the dyn keyword
  • handling the error case when the downcast is not possible
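The third cost can be made concrete. The following is a minimal sketch, assuming a hypothetical helper named `try_downcast_to_b` that is not part of the original example. `Box<dyn Any>::downcast` returns a `Result`, and on failure it hands the original box back, so the caller has to decide what to do with it instead of calling `unwrap()`.

```rust
use std::any::Any;

#[derive(Debug)]
struct B;

// Hypothetical helper: returns the concrete `B` on success,
// or gives the untouched box back to the caller on failure.
fn try_downcast_to_b(arg: Box<dyn Any>) -> Result<B, Box<dyn Any>> {
    arg.downcast::<B>().map(|boxed| *boxed)
}

fn main() {
    // Succeeds: the box really holds a `B`.
    assert!(try_downcast_to_b(Box::new(B)).is_ok());

    // Fails: the box holds a `String`, which comes back inside `Err`.
    let err = try_downcast_to_b(Box::new(String::from("not B"))).unwrap_err();
    assert_eq!(err.downcast_ref::<String>().map(String::as_str), Some("not B"));
}
```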

Most critically, downcasting only targets concrete types, so written this way you need one function per type you want to cast into.

fn is_downcastable_to_b(arg: Box<dyn std::any::Any>) -> bool {
    let Some(_) = arg.downcast_ref::<B>() else {
        return false;
    };
    true
}

// `C` stands for any other concrete type
fn is_downcastable_to_c(arg: Box<dyn std::any::Any>) -> bool {
    let Some(_) = arg.downcast_ref::<C>() else {
        return false;
    };
    true
}

This is not only poor design but also of little use, because it doesn't provide:

  • logic for checking whether the passed argument implements a trait
  • reusability of the function
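For what it's worth, the duplication on its own can be folded into one generic function. The sketch below assumes a hypothetical helper `is_downcastable_to` built on `dyn Any::is`. It helps with the second bullet, but the first one remains: `Any` compares concrete type IDs, so it can only answer "is this value exactly a `T`?", never "does this value implement trait A?".

```rust
use std::any::Any;

struct B;
struct C;

// Hypothetical generic helper: one function for every concrete target type.
// It only answers "is this value exactly a `T`?", because `Any` works by
// comparing the type ID of the stored value with the type ID of `T`.
fn is_downcastable_to<T: Any>(arg: &dyn Any) -> bool {
    arg.is::<T>()
}

fn main() {
    let boxed: Box<dyn Any> = Box::new(B);
    // `&*boxed` borrows the value inside the box rather than the box itself.
    assert!(is_downcastable_to::<B>(&*boxed));
    assert!(!is_downcastable_to::<C>(&*boxed));
}
```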



Challenge: Trait checking - whether B is a type that implements trait A

What we want is:

  • logic that checks whether an instance is of a type that implements a certain trait
  • if the condition is met, a way to pass the instance on to a different function that consumes an argument of that type

Example: DDD

Suppose you are practicing domain-driven design and you want your Repository to accept only the Aggregate mapped to it:

trait Aggregate {}

trait Repository {
    type Aggregate: Aggregate;

    fn add(&self, aggregate: &mut Self::Aggregate);
    fn update(&self, aggregate: &mut Self::Aggregate);
}

Here, what you want to add is an event_hook macro that accepts arguments of a type that implements Aggregate, takes events from them, and passes those events on to event handlers.

So the entire picture becomes:

trait Repository {
    // Debug is required because event_hook prints the aggregate with {:?}
    type Aggregate: Aggregate + std::fmt::Debug;

    fn add(&self, aggregate: &mut Self::Aggregate);

    // opt-in macro
    #[event_hook]
    fn update(&self, aggregate: &mut Self::Aggregate);

    fn event_hook(&mut self, aggregate: &mut Self::Aggregate) {
        println!("event hook called! {:?}", aggregate)
    }
}



Why do we need this in the first place?

When you write macros for a library or framework, the code should always be application-agnostic, meaning that the code in a macro cannot assume whether it will receive only arguments of a type that implements a certain trait or something else altogether.

Suppose both Repository and Aggregate are declared in the framework, and you don't want your client to have to know how to invoke the event_hook method. You just want the client to be aware of the event_hook handler.

That means the client can opt in to the event_hook macro as they wish. The issue, however, is that the client may pass a type that implements Aggregate along with some other arguments, as follows:

// client implementation of Aggregate and Repository
#[derive(Debug)]
struct OrderAggregate;
impl Aggregate for OrderAggregate {}
struct OrderRepository;

// The Repository trait fixes its own `abstract` methods, so there is no chance for the client to add other methods here.
impl Repository for OrderRepository {
    type Aggregate = OrderAggregate;

    fn add(&self, aggregate: &mut Self::Aggregate) {
        todo!()
    }
    fn update(&self, aggregate: &mut Self::Aggregate) {
        todo!()
    }
}


// But in Rust, the client can also attach an `inherent` implementation to the struct
impl OrderRepository {
    #[event_hook] // event hook used here!
    fn add_agg_with_args(&mut self, aggregate: &mut OrderAggregate, name: String) {
        todo!()
    }
}

At this point, you may have already guessed what the event_hook macro implementation should do:

  • generate code reactively, based on the arguments passed
  • check whether each argument is of a type that implements a certain trait (in this example, Aggregate) and, if so, pass it to the default method (event_hook)
  • skip the handling logic for any argument whose type doesn't implement Aggregate



End result after expansion:

With the event_hook macro expanding the application code, the result will be something along the lines of:


impl OrderRepository {
    //#[event_hook]
    fn add_agg_with_args(&mut self, aggregate: &mut OrderAggregate, name: String) {
        trait IsAggregateNotImplemented {
            const IS_AGGREGATE: bool = false;
            
            // The following takes Any type and return Any type T depending on the context!
            fn get_aggregate<T>(_: impl std::any::Any) -> &'static mut T {
                unreachable!()
            }
        }
        impl<T> IsAggregateNotImplemented for T {}
        struct IsAggregate<T>(::core::marker::PhantomData<T>);
        
        #[allow(unused)]
        impl<T: Aggregate> IsAggregate<T> {
            const IS_AGGREGATE: bool = true;
            fn get_aggregate(data: &mut T) -> &mut T {
                data
            }
        }

        if <IsAggregate<OrderAggregate>>::IS_AGGREGATE {
            self.event_hook(<IsAggregate<OrderAggregate>>::get_aggregate(aggregate));
        }
        if <IsAggregate<String>>::IS_AGGREGATE {
            self.event_hook(<IsAggregate<String>>::get_aggregate(name)); // this is unreachable
        }
        
        // main logic follows..
    }
}



Here are some key notes:

1. Blanket implementation:

trait IsAggregateNotImplemented {
    const IS_AGGREGATE: bool = false;

    // The following takes a value of any type and returns a reference to whatever type T the context asks for
    fn get_aggregate<T>(_: impl std::any::Any) -> &'static mut T {
        unreachable!()
    }
}
impl<T> IsAggregateNotImplemented for T {}

This gives every type a fallback get_aggregate and a fallback constant IS_AGGREGATE, so accessing them on any type yields a defined result. Note that get_aggregate takes a value of any type and returns a reference to the inferred type T. At the end of the day, this method should never actually be called; if it is, it panics via unreachable!(). I'll elaborate on this later.



2. Specialization:

struct IsAggregate<T>(::core::marker::PhantomData<T>);

#[allow(unused)]
impl<T: Aggregate> IsAggregate<T> {
    const IS_AGGREGATE: bool = true;
    fn get_aggregate(data: &mut T) -> &mut T {
        data
    }
}

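Here, any type that implements Aggregate gets specialized: IsAggregate<T> gains an inherent IS_AGGREGATE set to true and an inherent get_aggregate that returns the passed argument itself. This is not Rust's unstable specialization feature; it works because inherent items take precedence over trait items during lookup, so they shadow the blanket fallback whenever the T: Aggregate bound holds.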
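The lookup order is easier to see in isolation. The following is a minimal sketch that keeps only the constant and drops get_aggregate; `OrderAggregate` and `String` are used here simply as one implementing and one non-implementing type.

```rust
use std::marker::PhantomData;

trait Aggregate {}
struct OrderAggregate;
impl Aggregate for OrderAggregate {}

// Fallback: every type gets `IS_AGGREGATE = false` through a blanket impl.
trait IsAggregateNotImplemented {
    const IS_AGGREGATE: bool = false;
}
impl<T> IsAggregateNotImplemented for T {}

// Wrapper whose inherent constant only exists when `T: Aggregate`.
struct IsAggregate<T>(PhantomData<T>);
impl<T: Aggregate> IsAggregate<T> {
    const IS_AGGREGATE: bool = true;
}

fn main() {
    // The inherent constant applies, so it shadows the fallback.
    assert!(<IsAggregate<OrderAggregate>>::IS_AGGREGATE);
    // `String` is not an `Aggregate`, so only the trait fallback applies.
    assert!(!<IsAggregate<String>>::IS_AGGREGATE);
}
```

One caveat with this trick: it resolves at the place where the path is written, so it is only reliable with concrete types. Inside a generic function, where `T` has no `Aggregate` bound, the lookup would settle on the fallback.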



3. Type inference:

if <IsAggregate<OrderAggregate>>::IS_AGGREGATE {
    self.event_hook(<IsAggregate<OrderAggregate>>::get_aggregate(aggregate));
}
if <IsAggregate<String>>::IS_AGGREGATE {
    self.event_hook(<IsAggregate<String>>::get_aggregate(name)); // this is unreachable
}

Here, as briefly touched on in the blanket implementation note, the return type of the fallback get_aggregate is &'static mut T, where T is essentially left undefined. It is the call to self.event_hook that determines the actual type behind &'static mut T, which in this context is &mut OrderAggregate. That is why the String branch still type-checks even though it can never run.

What's astonishing about this approach is that you can pass arguments of any type, whether or not it implements Aggregate, to a method annotated with your custom macro, and the macro writes the trait-checking code for you.

The code generated by the macro TRIES to pass every argument to self.event_hook, which only accepts a type that implements Aggregate.

So while keeping type safety, you still get what macros have to offer: don't repeat yourself. Of course, that comes with a cost. But it is explicit anyway.
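To see all three pieces working together, here is a self-contained sketch with the expansion written out by hand. It is not the actual output of ruva's macro. For brevity it assumes event_hook is an inherent method on OrderRepository rather than a default method of Repository, and it adds a hypothetical `hooked` counter purely so the effect can be observed.

```rust
use std::marker::PhantomData;

trait Aggregate {}

#[derive(Debug)]
struct OrderAggregate;
impl Aggregate for OrderAggregate {}

struct OrderRepository {
    hooked: usize, // hypothetical counter, only here to observe the hook
}

impl OrderRepository {
    fn event_hook(&mut self, aggregate: &mut OrderAggregate) {
        self.hooked += 1;
        println!("event hook called! {:?}", aggregate);
    }

    // Hand-written stand-in for what `#[event_hook]` might expand to.
    fn add_agg_with_args(&mut self, aggregate: &mut OrderAggregate, name: String) {
        trait IsAggregateNotImplemented {
            const IS_AGGREGATE: bool = false;
            // Fallback: never called at runtime, it only has to type-check.
            fn get_aggregate<T>(_: impl std::any::Any) -> &'static mut T {
                unreachable!()
            }
        }
        impl<T> IsAggregateNotImplemented for T {}

        struct IsAggregate<T>(PhantomData<T>);
        #[allow(unused)]
        impl<T: Aggregate> IsAggregate<T> {
            const IS_AGGREGATE: bool = true;
            fn get_aggregate(data: &mut T) -> &mut T {
                data
            }
        }

        // One generated block per argument, whatever its type is.
        if <IsAggregate<OrderAggregate>>::IS_AGGREGATE {
            self.event_hook(<IsAggregate<OrderAggregate>>::get_aggregate(aggregate));
        }
        if <IsAggregate<String>>::IS_AGGREGATE {
            // Dead branch: the constant is `false` for `String`.
            self.event_hook(<IsAggregate<String>>::get_aggregate(name));
        }
    }
}

fn main() {
    let mut repo = OrderRepository { hooked: 0 };
    let mut order = OrderAggregate;
    repo.add_agg_with_args(&mut order, String::from("first order"));
    // Only the `OrderAggregate` argument reached the hook.
    assert_eq!(repo.hooked, 1);
}
```

Note that in this shape a non-aggregate argument such as `name` is passed by value into the dead branch, so it is moved there. A real macro would presumably need to account for that if the method body uses the argument afterwards.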

If you are interested in actual event_hook macro implementation, check out ruva.

Ruva is a framework written in Rust for event-driven architecture with domain-driven design practices.
