<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Kotonia Articles</title>
    <link>https://kotonia.ai/en/articles/</link>
    <description>Latest technical articles from kotonia.ai — voice AI / multilingual TTS / local-GPU stack notes.</description>
    <language>en</language>
    <lastBuildDate>Wed, 27 May 2026 00:01:59 GMT</lastBuildDate>
    <atom:link href="https://kotonia.ai/en/articles/feed.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Why I Trained a LoRA for HiDream-O1-Image — The Unfiltered Story</title>
      <link>https://kotonia.ai/en/articles/hidream-o1-lora-why/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/13c642eb-f6ef-4374-9b2e-79b0555afaa8/en</guid>
      <pubDate>Tue, 26 May 2026 23:33:39 GMT</pubDate>
      <description>The unfiltered story behind the HiDream-O1-Image LoRA: why anime quality is O1&apos;s weak spot, how 191 hand-picked images trained a visual booster, NSFW controllability, and what my analytics revealed about conversion drivers.</description>
      <category>lora</category>
      <category>hidream</category>
      <category>imagegen</category>
    </item>
    <item>
      <title>Betting on the video niche the big labs walked away from: from model A/B tests to making I2V the mainstay</title>
      <link>https://kotonia.ai/en/articles/residual-video-domain-i2v/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/f0d8ddbb-0c15-4726-8017-09161c49d570/en</guid>
      <pubDate>Tue, 26 May 2026 06:33:36 GMT</pubDate>
      <description>Is there niche demand for free creation with the guardrails off? A solo dev on one local GPU goes from model A/B tests to making high-res I2V the mainstay.</description>
      <category>solo-dev</category>
      <category>generative-ai</category>
      <category>video-generation</category>
      <category>local-gpu</category>
      <category>i2v</category>
    </item>
    <item>
      <title>HiDream Raw Output Failed → Tried Dev-2604 → VRAM Math Killed It → Won with a Prompt Enhancer Instead</title>
      <link>https://kotonia.ai/en/articles/hidream-prompt-enhancer/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/f5b3fcbd-52b3-459d-979a-dd428d4a9af8/en</guid>
      <pubDate>Sat, 23 May 2026 06:57:54 GMT</pubDate>
      <description>How we discovered that HiDream-O1&apos;s raw outputs collapse on plain Japanese prompts, tried switching to Dev-2604, found the VRAM math impossible, and won by adding a Gemini Flash Lite prompt enhancer instead. Four non-obvious HiDream pitfalls documented via A/B benchmarking.</description>
      <category>hidream</category>
      <category>diffusion</category>
      <category>promptengineering</category>
      <category>gemini</category>
      <category>imagegeneration</category>
    </item>
    <item>
      <title>Turning a 1-Line Idea Into a 40-Second Short with a 10-Beat Local Video Pipeline</title>
      <link>https://kotonia.ai/en/articles/consent-dilemma-10beat-pipeline/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/bb1b2af1-a523-451d-9e4f-d031787a87d0/en</guid>
      <pubDate>Fri, 22 May 2026 11:11:32 GMT</pubDate>
      <description>Full pipeline: Gemma 4 31B expands a one-liner into a 10-beat script, HiDream generates images, LTX-2 I2V renders clips, and ffmpeg assembles everything — all on one local GPU in 25–30 min.</description>
      <category>python</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>gpu</category>
    </item>
    <item>
      <title>Building a Sarcastic AI English Tutor with Persona-as-Code and Gemini Audio Input for Pronunciation Correction</title>
      <link>https://kotonia.ai/en/articles/mesugaki-english-persona-gemini-audio/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/364754ac-800f-475a-af4d-758610977a52/en</guid>
      <pubDate>Fri, 22 May 2026 11:09:13 GMT</pubDate>
      <description>How I built a high-attitude AI English tutor using persona-as-code design, Qwen3-ASR for multilingual STT, and Gemini audio input for real pronunciation feedback — solo dev perspective.</description>
      <category>ai</category>
      <category>webdev</category>
      <category>typescript</category>
      <category>rust</category>
    </item>
    <item>
      <title>Running LTX-2.3 Alongside TTS on a Single 96GB GPU with a Cold-Start Architecture</title>
      <link>https://kotonia.ai/en/articles/ltx2-cold-start-vram-coexistence/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/96857773-b422-476e-9b24-6096f61ee7d5/en</guid>
      <pubDate>Fri, 22 May 2026 11:07:05 GMT</pubDate>
      <description>How to go from 86 GiB idle VRAM (instant OOM) to 0 GiB idle / 40 GiB peak by using a cold-start design for LTX-2.3 on one RTX Pro 6000 Blackwell.</description>
      <category>gpu</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Cutting LTX-2 22B Peak VRAM by 40% with fp8_cast — and Why optimum-quanto Was a Trap</title>
      <link>https://kotonia.ai/en/articles/ltx2-22b-fp8-cast-quantization/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/8af9ff14-6784-442c-b66f-cd7985100bc5/en</guid>
      <pubDate>Fri, 22 May 2026 11:05:39 GMT</pubDate>
      <description>How fp8_cast reduced LTX-2 22B peak VRAM from 40 GiB to 24 GiB in cold-start mode, and why optimum-quanto silently breaks the transformer.</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>gpu</category>
      <category>python</category>
    </item>
    <item>
      <title>HiDream Skeleton Mode: Prompt Beats OpenPose Ref — 8 Patterns Benchmarked</title>
      <link>https://kotonia.ai/en/articles/hidream-skeleton-pose-prompt/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/23352312-01ee-4167-bd15-8b00192e67d3/en</guid>
      <pubDate>Fri, 22 May 2026 11:03:56 GMT</pubDate>
      <description>Benchmarking HiDream-O1-Image skeleton mode across 8 patterns reveals 3 counterintuitive findings about openpose refs, resolution drops, and shift values.</description>
      <category>ai</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>gpu</category>
    </item>
    <item>
      <title>Replicating a Language-Learning Comedy Short with Claude Code — Gemini as a Multimodal Sub-Agent</title>
      <link>https://kotonia.ai/en/articles/comedy-shorts-claude-gemini/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/3e8a3850-bd49-4779-b545-92cb6fed61f4/en</guid>
      <pubDate>Fri, 22 May 2026 11:01:06 GMT</pubDate>
      <description>Building a local GPU + Gemini 3.1 Pro hybrid pipeline that generates publishable comedy Shorts from a single line of text in under 60 seconds.</description>
      <category>ai</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Five Years Later, I Finally Have 96GB VRAM — What It Actually Unlocks for Agent Loops</title>
      <link>https://kotonia.ai/en/articles/96gb-vram-mma-and-5-years/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/ee5931ea-d81e-4dcd-a26b-5e84250754a1/en</guid>
      <pubDate>Fri, 22 May 2026 10:59:17 GMT</pubDate>
      <description>Not a GPU unboxing. A real look at what 96GB VRAM enables for multi-model agent pipelines — and where it still hits its limits.</description>
      <category>gpu</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
    </item>
    <item>
      <title>HiDream-O1-Image 3–8x Faster: Benchmarking Steps, CFG, and Resolution</title>
      <link>https://kotonia.ai/en/articles/hidream-quality-speed-bench/</link>
      <guid isPermaLink="false">https://kotonia.ai/articles/_g/b2efb8c1-a7e4-40b4-ac67-a698c1c20b31/en</guid>
      <pubDate>Fri, 22 May 2026 10:46:34 GMT</pubDate>
      <description>Real-world timing benchmarks for HiDream-O1-Image Full — tuning steps, guidance scale, and resolution to speed up iteration without killing quality.</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>gpu</category>
      <category>python</category>
    </item>
  </channel>
</rss>
