{
  "script": [
    {
      "text": "This supposedly casual stream is basically me showing how I tweaked version 4.0.0 of that crap audiobook maker tool.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "W-wait, why would you even show this stuff, like, in a normal video? It sounds way too nerdy.",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "Because Echo TTS is fine, sure, but the CC BY-NC licensing is a complete nightmare for real use, dammit.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "Oh geez, so if it's non-commercial, how can you actually do anything useful with it then?",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "I'm forcing a DAC VAE version of Echo TTS, ripping out the CC handcuffs to make this thing actually profitable, you idiot.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "B-but wait, you're using emojis on the Japanese TTS model? How does that actually work with voice?",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "It\u2019s signal manipulation, Morty, but the point is the vibe; it\u2019s emotional packaging, not just spitting out dull dictionary words.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "Aw man, if you can make the voices feel stuff, why aren't you making flashcards that are, like, also emotional?",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "I already made Japanese flashcards with an LLM and PaddleOCR's PP-OCRv5; it's a different kind of hell.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "S-so you're saying all this deep fine-tuning is just... just more work, even if it's better?",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "It\u2019s not just work; it\u2019s how you bypass limitations, you dense sack of protoplasm. Just another tedious thing.",
      "character": "Rick Sanchez",
      "characterAvatar": "characters/rick/rick.png",
      "artifact": "artifacts/square.png"
    },
    {
      "text": "I guess that\u2019s just it then. Watching you tinker with all this overly complicated junk.",
      "character": "Morty Smith",
      "characterAvatar": "characters/morty/morty.png",
      "artifact": "artifacts/square.png"
    }
  ]
}