Learning Rust: hash map lookup/insert pattern
In Suricata we’re experimenting with implementing app-layer parsers in Rust. See Pierre Chifflier’s presentation at the last SuriCon: [pdf].
The first experimental parsers will soon land in master.
Coming from the C world, I often use a pattern like:
    value = hash_lookup(hashtable, key);
    if (!value) {
        hash_insert(hashtable, key, somevalue);
    }
Playing with Rust and its HashMap implementation, I wanted to do something very similar: look up a vector and update it with the new data if it exists, or create a new vector if not:
    match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
        Some(mut v) => {
            v.extend(data);
        },
        None => {
            let mut v = Vec::with_capacity(32768);
            v.extend(data);
            self.chunks.insert(self.cur_ooo_chunk_offset, v);
        },
    };
Not super compact, but it looks sane to me. However, the borrow checker in the Rust compiler I’m using doesn’t accept it:
src/filetracker.rs:233:29: 233:40 error: cannot borrow `self.chunks` as mutable more than once at a time [E0499]
src/filetracker.rs:233 self.chunks.insert(self.cur_ooo_chunk_offset, v);
^~~~~~~~~~~
src/filetracker.rs:233:29: 233:40 help: run `rustc --explain E0499` to see a detailed explanation
src/filetracker.rs:224:27: 224:38 note: previous borrow of `self.chunks` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `self.chunks` until the borrow ends
src/filetracker.rs:224 match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
^~~~~~~~~~~
src/filetracker.rs:235:22: 235:22 note: previous borrow ends here
src/filetracker.rs:224 match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
...
src/filetracker.rs:235 };
^
error: aborting due to previous error
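Rust has strict rules on taking references: at any one time there can be either one mutable reference or any number of immutable references, but not both.
The ‘match self.chunks.get_mut(&self.cur_ooo_chunk_offset)’ counts as one mutable reference, and the compiler keeps that borrow alive until the end of the match block (the ‘previous borrow ends here’ note). ‘self.chunks.insert(self.cur_ooo_chunk_offset, v)’ inside the None arm would be the second. Thus the error.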
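To make the reference rule concrete, here is a minimal sketch that is separate from the Suricata code. The function name `borrow_demo` is made up for illustration, and the snippet assumes a reasonably recent compiler, which ends a borrow at its last use:

```rust
// Returns the lengths seen through two shared borrows, followed by the
// length after pushing through one exclusive borrow.
fn borrow_demo() -> (usize, usize, usize) {
    let mut v: Vec<u8> = vec![1, 2, 3];

    // Any number of immutable references may coexist.
    let a = &v;
    let b = &v;
    let len_a = a.len();
    let len_b = b.len();

    // A mutable reference must be the only live reference to `v`.
    let m = &mut v;
    m.push(4);
    // Reading `a` or `b` from here on would keep the shared borrows
    // alive across the mutable one, and the compiler would reject it.
    let len_after = m.len();

    (len_a, len_b, len_after)
}
```

The same rule is what trips up the match above: the borrow from get_mut and the call to insert both need mutable access to the same map.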
My naive way of working around it is this:
    let found = match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
        Some(mut v) => {
            v.extend(data);
            true
        },
        None => { false },
    };
    if !found {
        let mut v = Vec::with_capacity(32768);
        v.extend(data);
        self.chunks.insert(self.cur_ooo_chunk_offset, v);
    }
This is accepted by the compiler and works.
But I wasn’t quite happy yet, so I started looking for something better. I found this post on Stack Overflow (where else?).
It turns out there is a Rust pattern for this:
    use std::collections::hash_map::Entry::{Occupied, Vacant};
    let c = match self.chunks.entry(self.cur_ooo_chunk_offset) {
        Vacant(entry) => entry.insert(Vec::with_capacity(32768)),
        Occupied(entry) => entry.into_mut(),
    };
    c.extend(data);
Much better :)
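For anyone who wants to try the entry pattern outside of Suricata, here is a self-contained sketch. The `FileTracker` struct, its `new` constructor and the `add_chunk_data` method are hypothetical stand-ins. Only the `chunks` and `cur_ooo_chunk_offset` names come from the snippets above, and the key and value types (`u64`, `Vec<u8>`) are assumptions:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical stand-in for the real tracker; only the two fields used
// in this post are modelled, and their types are assumed.
struct FileTracker {
    chunks: HashMap<u64, Vec<u8>>,
    cur_ooo_chunk_offset: u64,
}

impl FileTracker {
    fn new() -> Self {
        FileTracker {
            chunks: HashMap::new(),
            cur_ooo_chunk_offset: 0,
        }
    }

    // Append `data` to the chunk at the current offset, creating the
    // chunk first if the map has no entry for that offset yet.
    fn add_chunk_data(&mut self, data: &[u8]) {
        let c = match self.chunks.entry(self.cur_ooo_chunk_offset) {
            Entry::Vacant(entry) => entry.insert(Vec::with_capacity(32768)),
            Entry::Occupied(entry) => entry.into_mut(),
        };
        c.extend(data);
    }
}
```

The map is borrowed mutably only once, by `entry()`, and both arms hand back a `&mut Vec<u8>`, so there is no second borrow for the compiler to complain about.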
It can even be done in a single line:
    self.chunks.entry(self.cur_ooo_chunk_offset).or_insert(Vec::with_capacity(32768)).extend(data);
Personally, I think this is getting too hard to read, but maybe I just need to grow into Rust syntax a bit more.
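One possible middle ground, sketched here as a free function, is to split the chain over several lines and use `or_insert_with` instead of `or_insert`. The `append_chunk` helper and its parameter types are made up for illustration. `or_insert_with` is part of the standard `Entry` API and takes a closure that is only called when the key is missing:

```rust
use std::collections::HashMap;

// Hypothetical helper: append `data` to the chunk stored at `offset`,
// creating the chunk if it does not exist yet.
fn append_chunk(chunks: &mut HashMap<u64, Vec<u8>>, offset: u64, data: &[u8]) {
    chunks
        .entry(offset)
        // The closure only runs for a missing key, so no 32 KiB buffer
        // is allocated when the chunk already exists.
        .or_insert_with(|| Vec::with_capacity(32768))
        .extend(data);
}
```

With plain `or_insert`, the `Vec::with_capacity(32768)` argument is evaluated on every call, even when the key is already present and the new vector is thrown away immediately.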