October 2022

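In the previous section we defined the vtable and its initialization. Next we implement the virtual functions themselves. First, a reminder of our class definition and the code it expands to: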

#[class]
pub struct Base
{
    ...
    virtual fn func1(&self) -> i32 { this.x }
    virtual fn func2(&self, i: i32) -> i32 { this.y + i }
}
// The expanded vtable and virtual function implementations
#[repr(C)]
pub struct BaseVTable
{
    func1: fn(this: &Base) -> i32,
    func2: fn(this: &Base, i: i32) -> i32,
}
impl Base
{
    ...
    fn func1_impl(this: &Base) -> i32 { this.data.x }
    fn func2_impl(this: &Base, i: i32) -> i32 { this.data.y + i }
    pub fn func1(&self) -> i32 { (self.vptr.func1)(self) }
    pub fn func2(&self, i: i32) -> i32 { (self.vptr.func2)(self, i) }
}

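Notice that the &self parameter in the class definition's function prototypes becomes this: &Base once it is expanded into the vtable. If we kept the self keyword, it would be interpreted inside the vtable struct as having type BaseVTable. In the implementation, func1_impl is the name of the method that holds the body, and func1 performs the call through the vtable.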
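
The dispatch pattern described above can be tried out in isolation. The following is a minimal hand-written sketch, not the output of the #[class] macro: the BaseData fields, the BASE_VTABLE static and the new constructor are assumptions made only so that the snippet is self-contained.

```rust
// Hypothetical data layout; the real fields come from the `#[class]` expansion.
pub struct BaseData {
    x: i32,
    y: i32,
}

#[repr(C)]
pub struct BaseVTable {
    // `this` is spelled out because `self` in this struct would mean `BaseVTable`.
    func1: fn(this: &Base) -> i32,
    func2: fn(this: &Base, i: i32) -> i32,
}

#[repr(C)]
pub struct Base {
    vptr: &'static BaseVTable,
    data: BaseData,
}

// Hypothetical name for the vtable instance shared by every `Base` object.
static BASE_VTABLE: BaseVTable = BaseVTable {
    func1: Base::func1_impl,
    func2: Base::func2_impl,
};

impl Base {
    // Hypothetical constructor that wires up the vtable pointer.
    pub fn new(x: i32, y: i32) -> Base {
        Base { vptr: &BASE_VTABLE, data: BaseData { x, y } }
    }

    // The bodies written by the user live in the `_impl` functions.
    fn func1_impl(this: &Base) -> i32 { this.data.x }
    fn func2_impl(this: &Base, i: i32) -> i32 { this.data.y + i }

    // The public methods only forward through the vtable.
    pub fn func1(&self) -> i32 { (self.vptr.func1)(self) }
    pub fn func2(&self, i: i32) -> i32 { (self.vptr.func2)(self, i) }
}
```

Calling Base::new(3, 4).func2(10) goes through vptr and ends up in func2_impl.
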
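Earlier, to generate the vtable, we defined the macros func1_type and func2_type, which return the function prototypes. Implementing the functions also requires the prototypes, so it is natural to reuse those macros, like this: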

#[macro_export] macro_rules! func1_type
{
    ($($name:ident $block:block)?) =>
    { fn $($name)? (this: &Base) -> i32 $($block)? };
}
#[macro_export] macro_rules! func2_type
{
    ($($name:ident $block:block)?) =>
    { fn $($name)? (this: &Base, i: i32) -> i32 $($block)? };
}

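Invoked with no arguments, the macro yields the prototype used to generate the vtable; given a function name and a code block, it generates the function implementation, as follows: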

func1_type!(func1_impl { this.data.x });
func2_type!(func2_impl { this.data.y + i });

This looks perfect, and cargo expand shows the expansion we expect. The compiler, however, disagrees:

error[E0425]: cannot find value `this` in this scope
  --> class_impl/src/lib.rs:50:30
   |
50 |     func1_type!(func1_impl { this.data.x });
   |                              ^^^^ not found in this scope
error[E0425]: cannot find value `this` in this scope
  --> class_impl/src/lib.rs:52:30
   |
52 |     func2_type!(func2_impl { this.data.y + i });
   |                              ^^^^ not found in this scope
error[E0425]: cannot find value `i` in this scope
  --> class_impl/src/lib.rs:52:44
   |
52 |     func2_type!(func2_impl { this.data.y + i });
   |                                            ^ not found in this scope

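This happens because identifiers produced by a Rust macro carry an invisible scope of their own: variables generated by a macro do not pollute the context where the macro is expanded. This is the biggest difference between Rust macros and C++ macros.
Take func2_impl as an example: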

fn                               // from the macro expansion
func2_impl                       // from the macro arguments
(this: &Base, i: i32) -> i32     // from the macro expansion
{ this.data.y + i }              // from the macro arguments

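The variables this and i in the function prototype come from the macro expansion, whereas the this and i used in the function body are passed in through the macro arguments. Their scopes differ, so the two this identifiers are unrelated, and so are the two i identifiers.
This is also part of Rust's safety story: we need not worry that a variable generated by a macro will accidentally shadow a variable we are already using and cause unexpected behavior. Rust calls this property hygiene.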
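
Hygiene is easy to observe with a pair of toy macros. This is an illustrative sketch unrelated to the class macros; all names in it are made up for the demonstration.

```rust
// The `x` bound inside this macro belongs to the macro's own syntax context.
// `$e` keeps the context of the call site, so a caller-supplied `x` still
// refers to the caller's variable.
macro_rules! add_to_hidden_x {
    ($e:expr) => {{
        let x = 100;
        x + $e
    }};
}

// Declares a local `x` that is invisible at the call site.
macro_rules! declare_hidden_x {
    () => {
        let x = 100;
        let _ = x;
    };
}

fn macro_x_is_separate() -> i32 {
    let x = 1;
    // Inside the expansion, the macro's `x` is 100 and the caller's `x` is 1.
    add_to_hidden_x!(x)
}

fn caller_x_is_untouched() -> i32 {
    let x = 1;
    declare_hidden_x!();
    // The macro's `let x = 100` did not shadow this `x`.
    x
}
```

Here macro_x_is_separate() evaluates to 101 and caller_x_is_untouched() to 1, which a purely textual expansion would not produce.
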
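Sometimes, though, we do need a macro to introduce variables. Our func2_type macro is one such case, since we want it to generate a function that compiles. There is a way to do this: capture the variable names explicitly, as follows: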

macro_rules! func1_type
{
    () => { fn (this: &Base) -> i32 };
    ($name:ident $this:ident $block:block) =>
    { fn $name ($this: &Base) -> i32 $block };
}
macro_rules! func2_type
{
    () => { fn (this: &Base, i: i32) -> i32 };
    ($name:ident $this:ident $i:ident $block:block) =>
    { fn $name ($this: &Base, $i: i32) -> i32 $block };
}

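To generate a complete function we added new captures for the parameter names, but when we only want the function prototype we do not want those captures complicating the call. As a result each macro has to be split into two rules. We can now generate complete functions like this: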

func1_type!(func1_impl this { this.data.x });
func2_type!(func2_impl this i { this.data.y + i });
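
For readers who want to compile the two-rule version, here is a self-contained sketch. The simplified Base and BaseData structs and the make_vtable helper are assumptions added for the example; the macros are the ones shown above.

```rust
// Simplified stand-ins for the generated types (no vptr, for brevity).
pub struct BaseData {
    x: i32,
    y: i32,
}
pub struct Base {
    data: BaseData,
}

macro_rules! func1_type {
    // No arguments: just the prototype, usable in type position.
    () => { fn(this: &Base) -> i32 };
    // Name, parameter identifier and body: a complete function.
    ($name:ident $this:ident $block:block) => {
        fn $name($this: &Base) -> i32 $block
    };
}
macro_rules! func2_type {
    () => { fn(this: &Base, i: i32) -> i32 };
    ($name:ident $this:ident $i:ident $block:block) => {
        fn $name($this: &Base, $i: i32) -> i32 $block
    };
}

// Prototypes for the vtable come from the argument-less rule.
pub struct BaseVTable {
    func1: func1_type!(),
    func2: func2_type!(),
}

// `this` and `i` are written at the call site, so the parameters and the
// identifiers used in the body share one syntax context.
func1_type!(func1_impl this { this.data.x });
func2_type!(func2_impl this i { this.data.y + i });

// Hypothetical helper that fills the vtable with the generated functions.
fn make_vtable() -> BaseVTable {
    BaseVTable { func1: func1_impl, func2: func2_impl }
}
```
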

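Macro hygiene is an expression of Rust's safety and lets us write safer macros. It occasionally gets in the way, but fortunately Rust offers a way around it.
However, we set out to reuse the existing macros to simplify the code, and the result is more complicated instead. That defeats the purpose, so when implementing a virtual function we will simply generate the function prototype again. But generating the prototypes for methods overridden in a derived class runs into a problem. Recall that the derived class Derive2 overrides two methods, func2 and func3: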

#[class]
pub struct Derive2 : Derive1
{
    // Something looks wrong here?
    override fn func2(&self, s: &str) -> Vec<i32> { ... }
    override fn func3(&self, f: f64) -> (i32, &str) { ... }
}

So what type should this have in the prototypes of these two methods? Intuitively it should follow the derived class's type, as follows:

// Function prototypes in the vtable definition:
func2: fn(this: &Derive2, s: &str) -> Vec<i32>,
func3: fn(this: &Derive2, f: f64) -> (i32, &str),
// Function implementations:
fn func2_impl(this: &Derive2, s: &str) -> Vec<i32> { ... }
fn func3_impl(this: &Derive2, f: f64) -> (i32, &str) { ... }

This approach has two problems:

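1. What type should func1, which we did not override, have? If it also follows Derive2's type, it cannot be assigned directly from Derive1::VTABLE::func1, because the types differ. To keep the vtable working we would have to generate extra code, which adds unnecessary overhead.
2. If an overriding method gets the function prototype wrong, as above, we not only fail to detect the problem but also generate code that compiles cleanly, while runtime safety is broken.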

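So I decided that a virtual function's type is fixed where it is first defined, that is, where it is marked with the virtual keyword. The vtable and implementations for Derive2 should then look like this: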

// Function prototypes in the vtable definition:
func2: fn(this: &Base, i: i32) -> i32,
func3: fn(this: &Derive1) -> i32,
// Function implementations:
fn func2_impl(this: &Base, s: &str) -> Vec<i32> { ... }
fn func3_impl(this: &Derive1, f: f64) -> (i32, &str) { ... }

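The compiler should now easily notice that func2_impl and func2 have different types and refuse to compile, which protects the runtime safety of the generated code. The derived class, however, does not have enough information to know what type the this parameter of each method should have. That is why we wanted to reuse the xxx_type macros. As we just saw, though, reusing xxx_type to generate whole functions brought no convenience, and it would also make a wrong function prototype hard to detect. This time we take a different approach: all we need to know is the type of this, as follows: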

macro_rules! func1_type
{
    () => { fn (this: &Base) -> i32 };
    (this) => { Base };
}
macro_rules! func2_type
{
    () => { fn (this: &Base, i: i32) -> i32 };
    (this) => { Base };
}
macro_rules! func3_type
{
    () => { fn (this: &Derive1) -> i32 };
    (this) => { Derive1 };
}
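
The (this) rule can be exercised on its own. The sketch below is a simplification under stated assumptions: Base has plain fields and no vptr, only Base-level functions are shown, the derived vtable reuses the BaseVTable type, and the const names and derive_func2_impl are hypothetical. It illustrates that an inherited entry can be copied by plain assignment because every entry keeps the type fixed by its first declaration.

```rust
// Simplified stand-in for the generated class.
pub struct Base {
    x: i32,
    y: i32,
}

macro_rules! func1_type {
    () => { fn(this: &Base) -> i32 };
    (this) => { Base };
}
macro_rules! func2_type {
    () => { fn(this: &Base, i: i32) -> i32 };
    (this) => { Base };
}

pub struct BaseVTable {
    func1: func1_type!(),
    func2: func2_type!(),
}

// The `this` type is looked up through the macro instead of being spelled out.
fn func1_impl(this: &func1_type!(this)) -> i32 { this.x }
fn func2_impl(this: &func2_type!(this), i: i32) -> i32 { this.y + i }

// A hypothetical override written in a derived class: same prototype as the base.
fn derive_func2_impl(this: &func2_type!(this), i: i32) -> i32 { this.y * i }

const BASE_VTABLE: BaseVTable = BaseVTable {
    func1: func1_impl,
    func2: func2_impl,
};

// func1 is inherited by copying the pointer; func2 is replaced by the override.
const DERIVE_VTABLE: BaseVTable = BaseVTable {
    func1: BASE_VTABLE.func1,
    func2: derive_func2_impl,
};
```

If derive_func2_impl were declared with a different parameter list or return type, the DERIVE_VTABLE initializer would fail to type-check.
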

We then implement the overriding virtual functions like this:

// Function prototypes in the vtable definition:
func2: func2_type!(),
func3: func3_type!(),
// Function implementations:
fn func2_impl(this: &func2_type!(this), s: &str) -> Vec<i32> { ... }
fn func3_impl(this: &func3_type!(this), f: f64) -> (i32, &str) { ... }

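Because that careless programmer wrote the function prototypes incorrectly, the compiler reports a type mismatch. Error messages that come out of macro-expanded code are not very readable, but that is still better than no error at all.
Seeing the error, the programmer corrects the function prototypes: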

#[class]
pub struct Derive2 : Derive1
{
    w: i32,
    override fn func2(&self, i: i32) -> i32 { self.w + ... }
    override fn func3(&self) -> i32 { self.w + ... }
}

Now we can start implementing the functions:

fn func2_impl(this: &func2_type!(this), i: i32) -> i32
{
    this.data.w + ...
}
fn func3_impl(this: &func3_type!(this)) -> i32
{
    this.data.w + ...
}

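We have just solved vtable initialization and the prototype mismatch, and a new problem appears.
The overriding implementations access w, a data member of Derive2. A derived class accessing its own data members would normally be no problem, but as discussed above, this in func2_impl has type &Base and this in func3_impl has type &Derive1, and neither can reach any of Derive2's data.
We know, however, that both of these this values really are &Derive2; they were written with the base class types only to keep the function prototypes compatible. That makes the fix straightforward.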

fn func2_impl(this: &func2_type!(this), i: i32) -> i32
{
    let this: &Self = unsafe { reinterpret_cast(this) }; 
    this.data.w + ...
}
fn func3_impl(this: &func3_type!(this)) -> i32
{
    let this: &Self = unsafe { reinterpret_cast(this) }; 
    this.data.w + ...
}

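Earlier we implemented a function that performs an unconditional cast, and it fits perfectly here. A reinterpret_cast conversion is also zero-cost, so there is no extra overhead to worry about.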
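
The signature of the series' reinterpret_cast is not shown in this section, so the sketch below uses a hypothetical stand-in with the same name. It also assumes #[repr(C)] structs with the base stored as the first field, and simplifies the hierarchy to Base and Derive2 without a data wrapper. The cast is only meaningful when this really refers to the base part of a Derive2; Rust's aliasing rules for this kind of down-cast are not fully specified, so checkers such as Miri may flag it.

```rust
// Hypothetical stand-in for the unconditional cast defined earlier in the series.
unsafe fn reinterpret_cast<T, U>(r: &T) -> &U {
    unsafe { &*(r as *const T as *const U) }
}

#[repr(C)]
pub struct Base {
    y: i32,
}

// Assumed layout: the base sits at offset 0, followed by the derived fields.
#[repr(C)]
pub struct Derive2 {
    base: Base,
    w: i32,
}

// The prototype keeps the base type so that it matches the vtable entry.
fn func2_impl(this: &Base, i: i32) -> i32 {
    // Assumption: the caller only passes the base part of a live `Derive2`.
    let this: &Derive2 = unsafe { reinterpret_cast(this) };
    this.base.y + this.w + i
}

// Hypothetical caller: views the whole object as its base, then dispatches.
fn call_func2(d: &Derive2, i: i32) -> i32 {
    let base: &Base = unsafe { reinterpret_cast(d) };
    func2_impl(base, i)
}
```
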
The function implementations are still not quite complete; in the next section we fill in the ... parts.

Emulating C++ Class Features in Rust: Macro Hygiene (Part 6)
