Do - Listen, Process, Act & Learn

Porting to Rust

says ⮞ I use 20x magnification when I code and debug. I use emoji to simplify logs for myself. If you can't handle my code style you can disable most of it on this website by toggling the button in the navbar. Shall duck continue?

Having already built a fully featured version in Bash, the hardest part, the architecture and logic, was already solved.
The only thing left was to port it to a faster language.
Surprisingly, even for a Rust noob like me, the migration was straightforward and painless.

The result?
The same powerful tool, now with a massive performance boost. This post won't rehash the features (covered in the Bash version's write-up).
Instead we'll briefly go over the file structure and then jump straight to one thing: the breathtaking speed of the Rust rewrite.


File Structure

The Rust code is written as a Nix string and generated at build time.
The structure is pretty straightforward and easy to follow.
A link to the full source code on GitHub is at the bottom of this page.

⮞ View File Structure

  cfg = config.yo;
  # 🦆 says ⮞ Statistical logging for failed commands
  statsDir = "/home/${config.this.user.me.name}/.local/share/yo/stats";
  failedCommandsLog = "${statsDir}/failed_commands.log";
  commandStatsDB = "${statsDir}/command_stats.json";
  
  # 🦆 says ⮞ grabbin’ all da scripts for ez listin'  
  scripts = config.yo.scripts; 
  scriptNames = builtins.attrNames scripts; # 🦆 says ⮞ just names - we never name one
  # 🦆 says ⮞ only scripts with known intentions
  scriptNamesWithIntents = builtins.filter (scriptName:
    let # 🦆 says ⮞ a intent iz kinda ..
      intent = generatedIntents.${scriptName};
      # 🦆 says ⮞ .. pointless if it haz no sentence data ..
      hasSentences = builtins.any (data: data ? sentences && data.sentences != []) intent.data;
    in # 🦆 says ⮞ .. so datz how we build da scriptz!
      builtins.hasAttr scriptName generatedIntents && hasSentences
  ) (builtins.attrNames scriptsWithVoice);


  # 🦆 says ⮞ only scripts with voice enabled and non-null voice config
  scriptsWithVoice = lib.filterAttrs (_: script: 
    script.voice != null && (script.voice.enabled or true)
  ) config.yo.scripts;
 
  # 🦆 says ⮞ generate intents
  generatedIntents = lib.mapAttrs (name: script: {
    priority = script.voice.priority or 3;
    data = [{
      inherit (script.voice) sentences lists;
    }];
  }) scriptsWithVoice;

  fuzzyFlatIndex = lib.flatten (lib.mapAttrsToList (scriptName: intent:
    lib.concatMap (data:
      lib.concatMap (sentence:
        map (expanded: {
          script = scriptName;
          sentence = expanded;
          signature = let
            words = lib.splitString " " (lib.toLower expanded);
            sorted = lib.sort (a: b: a < b) words;
          in builtins.concatStringsSep "|" sorted;
        }) (expandOptionalWords sentence)
      ) data.sentences
    ) intent.data
  ) (lib.mapAttrs (name: script: {
    priority = script.voice.priority or 3;
    data = [{
      inherit (script.voice) sentences lists;
    }];
  }) scriptsWithFuzzy));

  # 🦆 duck say ⮞ u like speed too? Rusty Speed inc
  do-rs = pkgs.writeText "do.rs" ''
    // 🦆 SCREAMS ⮞ 500x++ FASTER!!🚀
    // 🦆 ... 
  '';  
  cargoToml = pkgs.writeText "Cargo.toml" ''    
    [package]
    name = "yo_do"
    version = "0.2.0"
    edition = "2021"

    [dependencies]
    regex = "1.0"
    serde = { version = "1.0", features = ["derive"] }
    serde_json = "1.0"
    chrono = { version = "0.4", features = ["serde", "clock"] }
    rand = "0.8"
    tokio = { version = "1.0", features = ["full"] }
    tokio-tungstenite = "0.20"
    futures-util = "0.3"
  '';
  
in { # 🦆 says ⮞ YOOOOOOOOOOOOOOOOOO    
  yo.scripts = { # 🦆 says ⮞ quack quack quack quack quack.... qwack 
    # 🦆 says ⮞ GO RUST DO I CHOOSE u!!1
    do = {
      description = "Brain (do) is a Natural Language to Shell script translator that generates dynamic regex patterns at build time for defined yo.script sentences. At runtime it runs exact and fuzzy pattern matching with automatic parameter resolution and seamless execution";
      category = "🗣️ Voice"; # 🦆 says ⮞ duckgorize iz zmart wen u hab many scriptz i'd say!     
      aliases = [ "brain" ];
      autoStart = false;
      logLevel = "INFO";
      helpFooter = ''
        echo "[🦆🧠]"
        cat ${voiceSentencesHelpFile} 
      '';
      parameters = [ # 🦆 says ⮞ set your mosquitto user & password
        { name = "input"; description = "Text to translate"; optional = true; } 
        { name = "fuzzy"; type = "int"; description = "Minimum percentage for considering fuzzy matching successful. (1-100)"; default = 60; }
        { name = "dir"; description = "Directory path to compile in"; default = "/home/pungkula/do-rs"; optional = false; } 
        { name = "build"; type = "bool"; description = "Flag for building the Rust binary"; optional = true; default = false; }            
        { name = "realtime"; type = "bool"; description = "Run in real-time mode for voice assistant"; optional = true; default = false; } 
      ];
      code = ''
        set +u  
        ${cmdHelpers} # 🦆 says ⮞load required bash helper functions 
        FUZZY_THRESHOLD=$fuzzy
        YO_FUZZY_INDEX="${fuzzyIndexFlatFile}"
        text="$input"
        INTENT_FILE="${intentDataFile}"
        # 🦆 says ⮞ create the source filez yo 
        cat ${do-rs} > src/main.rs
        cat ${cargoToml} > Cargo.toml     
        ${pkgs.cargo}/bin/cargo generate-lockfile     
        ${pkgs.cargo}/bin/cargo build --release
      '';
    };    
  };
  # 🦆 says ⮞ SAFETY FIRST! 
  assertions = [
    {
      assertion = assertionCheckForConflictingSentences.assertion;
      message = assertionCheckForConflictingSentences.message;
    } # 🦆 says ⮞ the duck be stateless, the regex be law, and da shell... is my pond.    
  ];}# 🦆 say ⮞ nobody beat diz nlp nao says sir quack a lot NOBODY I SAY!
# 🦆 says ⮞ QuackHack-McBLindy out!  
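
The `signature` field in `fuzzyFlatIndex` is what makes word order irrelevant: lowercase the sentence, split it on spaces, sort the words and join them with `|`. Here is a minimal Rust sketch of the same idea on the runtime side. It is not taken from `do.rs`; `FuzzyEntry` and `find_by_signature` are hypothetical names made up for illustration, and the real lookup may well work differently.

```rust
// Sketch only: mirrors the `signature` field built by `fuzzyFlatIndex` in the Nix code.
// `FuzzyEntry` and `find_by_signature` are hypothetical names, not taken from do.rs.

#[derive(Debug, Clone, PartialEq)]
struct FuzzyEntry {
    script: String,
    sentence: String,
    signature: String,
}

/// Lowercase the sentence, split on single spaces, sort the words, join with "|".
/// Note: `to_lowercase` is Unicode-aware; the Nix side may only lowercase ASCII,
/// so results could differ for non-ASCII uppercase letters.
fn signature(sentence: &str) -> String {
    let lowered = sentence.to_lowercase();
    let mut words: Vec<&str> = lowered.split(' ').collect();
    words.sort_unstable(); // byte-wise string ordering
    words.join("|")
}

/// Return the first index entry whose signature equals the input's signature.
fn find_by_signature<'a>(index: &'a [FuzzyEntry], input: &str) -> Option<&'a FuzzyEntry> {
    let wanted = signature(input);
    index.iter().find(|entry| entry.signature == wanted)
}

fn main() {
    let index = vec![FuzzyEntry {
        script: "time".to_string(),
        sentence: "vad är det för dag".to_string(),
        signature: signature("vad är det för dag"),
    }];

    // Same words in a different order still hit the same entry.
    match find_by_signature(&index, "det är vad för dag") {
        Some(entry) => println!("matched {} via '{}'", entry.script, entry.sentence),
        None => println!("no match"),
    }
}
```

Because the signature is computed the same way at build time and at runtime, a lookup like this is a plain string comparison, with no regex involved.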

Let's Try It Out!

⮞ View Test Run Output

🦆🏠  HOME via  via 🐍 v3.12.10 via 🦀 v1.86.0 
16:38:14 ❯ yo do "vad är det för dag"
[🦆📜] [16:38:36] ✅INFO✅ ⮞ [🦆🧠] 'vad är det för dag'
[🦆📜] ✅INFO✅ ⮞ MEMORY ADJUSTMENT: time: base=3 → adjusted=0 (uses=1, confirms=0, context=YES)
[🦆📜] ✅INFO✅ ⮞ MEMORY ADJUSTMENT: travel: base=3 → adjusted=3 (uses=0, confirms=0, context=NO)
🦆MEMORY:SCRIPT:time
🦆MEMORY:ARGS:
🦆MEMORY:SENTENCE:vad är det för dag
🦆MEMORY:TYPE:exact
   ┌─(yo-time)
   │🦆 qwack!? vad är det för dag
   └─🦆 says ⮞ no parameters yo
   └─⏰ do took 6.752782ms

🏆⮞ HOLY 🦆 THAT'S FAST!!1
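
For anyone wondering where a number like `6.752782ms` comes from: that shape matches how Rust prints a `std::time::Duration` with `{:?}`. Below is a small sketch of measuring and printing elapsed time that way. Whether `do.rs` does exactly this is an assumption on my part; `timed` and `took_line` are hypothetical helper names.

```rust
use std::time::{Duration, Instant};

/// Run `work` and return its result together with the elapsed wall-clock time.
/// Hypothetical helper, not taken from do.rs.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

/// Format a duration in the style of the log line above,
/// using Duration's Debug formatting.
fn took_line(elapsed: Duration) -> String {
    format!("⏰ do took {:?}", elapsed)
}

fn main() {
    // Stand-in workload; the real tool would run its matching here.
    let (sum, elapsed) = timed(|| (1u64..=1_000_000).sum::<u64>());
    println!("sum = {}", sum);
    println!("{}", took_line(elapsed));
}
```

`Instant` is a monotonic clock, so it is a safer choice for timing than reading the system time twice.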


New Bottleneck

Soo... that's roughly 500x to 1500x faster than the Bash version.
If you had told me two years ago that my voice assistant's bottleneck would be TTS generation, I don't think I would have believed you. This changes everything, and leaves me thinking I should take a good look at Text-To-Speech next.
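
To put that ratio in perspective, here is a back-of-the-envelope sketch in Rust. It takes the 6.752782 ms from the test run and multiplies it by both ends of the speedup range to get the runtime that ratio would imply for the Bash version. The Bash timings themselves are not listed in this post, so treat the results as implied values under that assumption, not measurements. `implied_baseline_secs` is a hypothetical helper.

```rust
/// Implied baseline runtime in seconds, given the new runtime in milliseconds
/// and a speedup factor. Hypothetical helper for illustration only.
fn implied_baseline_secs(new_runtime_ms: f64, speedup: f64) -> f64 {
    new_runtime_ms * speedup / 1000.0
}

fn main() {
    let rust_ms = 6.752782; // "do took 6.752782ms" from the test run
    for speedup in [500.0, 1500.0] {
        let secs = implied_baseline_secs(rust_ms, speedup);
        println!("{}x implies about {:.1} s before the port", speedup, secs);
    }
}
```

In other words, a single run at this speed would correspond to somewhere between a few seconds and about ten seconds at the old speed, which is why TTS is suddenly the slow part.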


As far as I know, this beast is now the world's fastest open source intent recognition solution available for voice assistants today.
That of course makes me both proud and happy, but I would be lying if I said it hasn't been a bumpy ride.
I never thought a single project could keep my hands this busy for this long.


The Full Source

View source code on GitHub

Keep Reading

Developing a feature rich NLP with jq, Nix & Bash
Making Good Use of the Combinational Explosion Requires Top Tier Speed - Porting to Rust⮜🦆here u are
(WIP)Memory - Context-Aware & Self-Learning Voice Assistant
Sentence Validation - Rapid Fast Testing Framework

