Stochastic population model in Rust

2020 Dec 12 @travishinkelman.com

As I spent a little time learning F# over the last few months, I found that it wasn't holding my attention. My interest in F# was based on the idea that I could write more robust code (via static typing) than in R and that I could more easily turn that code into web or desktop applications. I still think that F# could be a valuable tool to add to my toolbox, but I encountered just enough friction that I wasn't having fun with it. My primary point of frustration is that so much material for learning F# assumes that you already know C# and .NET. Plus, the roll out of .NET 5 and F# 5 this fall, while exciting, creates a period of increased confusion for beginners.

Surprisingly, my reaction to finding F# harder to learn than expected pushed me in the direction of wanting to learn Rust, a language that is notoriously hard to learn. Basically, the logic is that learning a statically-typed language is a big enough leap for me that I might as well take it one step farther and try to develop a better low-level understanding of programming. My initial impression of Rust is very favorable. Exploring Rust has emphasized how important good documentation, tutorials, and tooling are to me. Rust excels in all of these areas.

As a first exercise in Rust, I decided to use the simple stochastic logistic population model described in this blog post that I've previously implemented in Racket and F#. Because the Rust documentation is so good, I won't say much about getting started with Rust. I've only tried installing and using Rust on Ubuntu 20.04 and I'm using VS Code with the rust-analyzer extension.

First, let's create a new project.

cargo new stochastic-logistic --bin

This creates a new directory called stochastic-logistic with this directory structure:

stochastic-logistic/Cargo.toml
stochastic-logistic/src/main.rs

Dependencies are declared in Cargo.toml. For this stochastic logistic example, we need to add a couple of dependencies to Cargo.toml to allow us to draw from a random normal distribution.

[dependencies]
rand = "0.7.3"
rand_distr = "0.3.0"

Replace the default code in main.rs with the following code:

use rand_distr::{Normal, Distribution};

fn main() {
    const YINIT: f64 = 1.0;                              // initial population size
    const R: f64 = 1.4;                                  // maximum population growth rate
    const K: f64 = 20.0;                                 // carrying capacity
    const THETASD: f64 = 0.1;                            // std dev to add noise

    // 0th element of args is program name
    let args: Vec<String> = std::env::args().collect();
    
    let reps = args[1].parse::<usize>().unwrap();
        
    // number of years of growth to simulate       
    let num_years = args[2].parse::<usize>().unwrap();  
    
    let mut results = Vec::with_capacity(reps);
    for _ in 0..reps {
        results.push(logmod(YINIT, R, K, THETASD, num_years));
    }
    println!("{:?}", results);
}

fn logmod(yinit: f64, r: f64, k: f64, thetasd: f64, t: usize) -> Vec<f64> {
    let mut ys = Vec::with_capacity(t);
    ys.push(yinit);
    for i in 1..t {
        let normal = Normal::new(0.0, thetasd).unwrap();
        let normal_draw = normal.sample(&mut rand::thread_rng());
        let ys_i = ys.get(i-1).unwrap() * 
        (r - r * (ys.get(i-1).unwrap() / k)) * normal_draw.exp();
        ys.push((ys_i * 100.0).round() / 100.0);
    }
    ys
}

Shortly, I will describe this program, but, first, we can build and run the program with cargo run 2 10. [Note, the Hello world program created by cargo new requires no command line arguments and can be run with cargo run.] This will run two simulations of 10 years each and produces this output:

[[1.0, 1.79, 2.46, 2.71, 3.35, 4.31, 4.59, 4.57, 4.32, 4.42], 
 [1.0, 1.28, 1.46, 1.8, 2.29, 2.43, 3.05, 3.54, 4.75, 5.2]]

Now let's go through the program. First, we declare a few constants. The compiler provides a warning if the names are not in all upper case (or UPPER_SNAKE_CASE), but the code will still compile. However, there is no type inference with constants. I've used the default type for floats of f64, but f32 would work in this case, too.

const YINIT: f64 = 1.0;                              // initial population size
const R: f64 = 1.4;                                  // maximum population growth rate
const K: f64 = 20.0;                                 // carrying capacity
const THETASD: f64 = 0.1;                            // std dev to add noise

The next part of the program handles arguments. We collect the arguments into a string vector. Both arguments are parsed from strings to unsigned integers (usize means that the integers are sized to the build target). unwrap is the quick-and-dirty way of telling the compiler that we are fine with the program crashing if the argument is not available or can't be parsed.

// 0th element of args is program name
let args: Vec<String> = std::env::args().collect();

let reps = args[1].parse::<usize>().unwrap();
    
// number of years of growth to simulate       
let num_years = args[2].parse::<usize>().unwrap();  

In the last part of main, we have a simple loop to repeat the logmod function reps times. I initially used an array to collect results, but the size of the array needs to be known at compile time, which means that you can't initialize an array with command line arguments. Later, I read that "you should probably just use Vec or HashMap. These two collections cover most use cases for generic data storage and processing." Because data structures are immutable by default in Rust, we have to declare that results is mutable with mut. The exclamation point in println! indicates that it is a macro.

let mut results = Vec::with_capacity(reps);
for _ in 0..reps {
    results.push(logmod(YINIT, R, K, THETASD, num_years));
}
println!("{:?}", results);

The last part of the code is the logmod function. We specify the types of all the input parameters and the return type (after the ->). The initial population abundance is pushed onto the ys vector and subsequent values are calculated based on the previous value in ys. Here, I used get(i-1). Alternatively, I could have used last(), which would require that for i in 1..t is changed to for _ in 1..t because the i is not used in last(). &mut is related to the notorious borrow checking in Rust, which I don't yet understand. I was just following the example provided for drawing random numbers.

fn logmod(yinit: f64, r: f64, k: f64, thetasd: f64, t: usize) -> Vec<f64> {
    let mut ys = Vec::with_capacity(t);
    ys.push(yinit);
    for i in 1..t {
        let normal = Normal::new(0.0, thetasd).unwrap();
        let normal_draw = normal.sample(&mut rand::thread_rng());
        let ys_i = ys.get(i-1).unwrap() * 
        (r - r * (ys.get(i-1).unwrap() / k)) * normal_draw.exp();
        ys.push((ys_i * 100.0).round() / 100.0);
    }
    ys
}

I know that I've hardly touched on any of the features of Rust, but this was a nice experience. I also really enjoyed the guessing game tutorial in the Rust book. I'm particularly interested in the potential for Rust in scientific computing and plan to try to rewrite a couple of small Fortran programs in Rust.