← Back to Blog

Meet Jotley - Building a Native Apple Notes App with AI Transcription

January 30, 2026 • Protomota

Hey everyone šŸ‘‹

Meet Jotley — a native voice notes app for Mac and iOS with real-time AI transcription. This is my first "building in public" post, so let me catch you up on where things stand and why I'm taking the approach I am.

The One-Liner

Jotley is what Apple Notes would be if it could transcribe your voice, summarize it with AI, and let you chat with your notes. Record a thought, get a transcript and summary instantly. Ask questions about what you recorded. Syncs across all your devices.

šŸ‘‰ jotley.ai

Why Native SwiftUI (and why this matters)

This is probably the most opinionated decision I made early on, and I want to be honest about the trade-offs.

Why not Electron/React Native?

I've used plenty of Electron apps. Notion, Slack, Discord — they work, but you feel it. The memory usage, the slight input lag, the way they just don't integrate with the OS the way native apps do.

For a notes app that I want people to reach for 20+ times a day, native performance isn't optional. It's the product.

The honest trade-off: I'm targeting maybe 30% of the smartphone market (Apple users only). That's a real limitation. But I'd rather build something excellent for a smaller market than something mediocre for everyone.

But here's the real reason: this app simply wouldn't be possible with cross-platform frameworks.

When you're building a voice-first app that needs to:

  • Capture system audio on Mac (requires Apple's ScreenCaptureKit and ProcessTap APIs)
  • Handle real-time audio streaming with AVFoundation
  • Integrate deeply with iCloud sync via CloudKit
  • Feel like a first-class citizen on both platforms

...there's no React Native bridge for that. These are Apple-only SDKs that require native Swift code. I could've built a lesser version with cross-platform tools, but the core features that make Jotley special wouldn't exist.
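To make the first bullet concrete, here's a minimal sketch of what Mac system-audio capture looks like with ScreenCaptureKit. This is illustrative, not Jotley's actual code: the class name and the hand-off to transcription are assumptions, and a real app also needs the Screen Recording permission granted by the user.

```swift
import ScreenCaptureKit
import AVFoundation

// Hedged sketch: tap system audio on macOS 13+ via ScreenCaptureKit.
final class SystemAudioCapturer: NSObject, SCStreamOutput {
    private var stream: SCStream?

    func start() async throws {
        // Enumerate shareable content and pick the main display.
        let content = try await SCShareableContent.excludingDesktopWindows(
            false, onScreenWindowsOnly: true)
        guard let display = content.displays.first else { return }

        let filter = SCContentFilter(display: display, excludingWindows: [])
        let config = SCStreamConfiguration()
        config.capturesAudio = true               // the key flag: capture system audio
        config.excludesCurrentProcessAudio = true // don't record our own playback

        let stream = SCStream(filter: filter, configuration: config, delegate: nil)
        try stream.addStreamOutput(self, type: .audio, sampleHandlerQueue: .global())
        try await stream.startCapture()
        self.stream = stream
    }

    // Audio arrives as CMSampleBuffers on the handler queue.
    func stream(_ stream: SCStream,
                didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                of type: SCStreamOutputType) {
        guard type == .audio else { return }
        // ... convert to PCM and hand off to the transcription pipeline ...
    }
}
```

There is no equivalent of `SCStreamConfiguration.capturesAudio` exposed through Electron or React Native; this is exactly the kind of Apple-only surface area the bullets above are pointing at.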

The 60/40 code split

Here's something I'm proud of: ~60% of my codebase is shared between iOS and macOS. Views, models, services, sync logic — all shared. Platform-specific stuff (system audio capture on Mac, voice recording on iOS) lives in separate directories.

This wasn't automatic. It took deliberate architecture work. But now when I add a feature, I usually only write it once.
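One common way to get that kind of split (a sketch of the general pattern, not necessarily Jotley's exact setup) is to put a protocol in the shared code and let each platform supply its own implementation behind `#if os(...)` conditional compilation:

```swift
import Foundation

// Shared target: views, models, and services depend only on this protocol.
protocol AudioRecorder {
    func startRecording() throws
    func stopRecording() -> URL?
}

#if os(macOS)
// Mac-only directory: system audio capture (ScreenCaptureKit / process taps).
final class MacSystemAudioRecorder: AudioRecorder {
    func startRecording() throws { /* SCStream setup goes here */ }
    func stopRecording() -> URL? { nil }
}
typealias PlatformRecorder = MacSystemAudioRecorder
#elseif os(iOS)
// iOS-only directory: microphone capture via AVAudioSession/AVAudioEngine.
final class PhoneMicRecorder: AudioRecorder {
    func startRecording() throws { /* AVAudioEngine setup goes here */ }
    func stopRecording() -> URL? { nil }
}
typealias PlatformRecorder = PhoneMicRecorder
#endif
```

The shared 60% codes against `AudioRecorder`; only the two recorder implementations live in platform-specific directories, which is why a new feature usually only gets written once.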

The Architecture (High Level)

Native SwiftUI apps talking to a lightweight backend that handles transcription and AI. The apps stream audio, the backend processes it, transcription and summaries come back in real-time.

Why a backend at all? API key security, usage tracking, and the ability to ship new AI features without waiting for App Store review.
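At a sketch level, the client side of that pipe can be as simple as a WebSocket that sends audio chunks up and receives transcript fragments back. Everything here is an assumption for illustration — the endpoint URL, the framing, and the class name are made up, not Jotley's real API:

```swift
import Foundation

// Illustrative client: stream audio chunks up, read partial transcripts back.
final class TranscriptionClient {
    private let task: URLSessionWebSocketTask

    init(url: URL) {
        task = URLSession.shared.webSocketTask(with: url)
        task.resume()
    }

    // Called repeatedly as the recorder produces PCM chunks.
    func send(audioChunk: Data) {
        task.send(.data(audioChunk)) { error in
            if let error { print("send failed: \(error)") }
        }
    }

    // Re-arms itself after each message so transcripts stream in continuously.
    func receiveTranscripts(onText: @escaping (String) -> Void) {
        task.receive { [weak self] result in
            if case .success(.string(let text)) = result {
                onText(text)                              // partial transcript
                self?.receiveTranscripts(onText: onText)  // keep listening
            }
        }
    }
}
```

Because the transcription and summarization models live behind this socket, swapping or upgrading them is a server deploy, not an App Store review cycle.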

The Development Journey

What took longer than expected:

  1. CloudKit sync — Getting conflict resolution right is hard. When you delete a note on Mac while your iPhone is offline, what happens when it comes back online? I went through three different sync strategies before landing on one that feels reliable.

  2. System audio capture on Mac — This requires special entitlements from Apple and involves newer APIs (ScreenCaptureKit, ProcessTap). Getting the screen recording permission flow to feel native took iteration.

  3. Real-time transcription feel — The words need to appear smoothly as you speak. Buffer too much and the updates land in big chunks that feel broken; flush too often and the text flickers in word by word and feels jittery. Finding the right buffering took trial and error.
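For the CloudKit case in item 1, the shape of the problem is a merge rule. Here's a hedged sketch of one workable policy — tombstoned deletes instead of hard deletes, with last-writer-wins on timestamps. This is one of several possible strategies, not necessarily the one Jotley landed on, and the types are invented for illustration:

```swift
import Foundation

// Illustrative sync record: a tombstone (deletedAt) instead of a hard delete.
struct NoteRecord {
    let id: String
    var text: String
    var modifiedAt: Date
    var deletedAt: Date?
}

// Merge rule: a delete beats any edit made before it; an edit made
// after the delete resurrects the note; otherwise last writer wins.
func merge(local: NoteRecord, remote: NoteRecord) -> NoteRecord {
    if let deletedAt = remote.deletedAt, deletedAt >= local.modifiedAt { return remote }
    if let deletedAt = local.deletedAt, deletedAt >= remote.modifiedAt { return local }
    return local.modifiedAt >= remote.modifiedAt ? local : remote
}
```

Under this rule, deleting a note on the Mac writes a tombstone; when the offline iPhone comes back, its stale copy loses to the newer tombstone instead of resurrecting the note — which is exactly the scenario that makes naive sync feel unreliable.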

What was easier than expected:

  1. SwiftUI in 2026 — It's actually mature now. Most of the jank from early versions is gone. I'm building features, not fighting the framework.

  2. Transcription quality — Modern speech-to-text is genuinely excellent. Fast and accurate out of the box.

Honest Self-Assessment

What I'm doing right:

  • Building something I actually use daily
  • Native performance is a real differentiator
  • The core experience (record → transcribe → summarize → chat) is tight

What I'm doing wrong:

  • Zero marketing execution
  • No content, no social presence, no audience
  • "Build it and they will come" is not a strategy

What's Next

I'm treating this month as the start of the real push:

  1. Record a demo video — 60-90 seconds showing the core flow
  2. Update App Store screenshots — Current ones don't show transcription
  3. Start posting on Indie Hackers — You're reading post #1
  4. Ship the payment integration — Can't be indie hacking at $0 MRR forever

Questions for the Community

  1. Pricing thoughts? I've got two tiers: Essential ($10/month or $79/year — works out to ~$6.50/month) and Unlimited ($20/month or $158/year — ~$13/month). Otter.ai charges $16.99/month. Am I positioned right?

  2. Product Hunt timing? Do I launch now with what I have, or wait until the demo video and updated assets are ready?

  3. What would make you try it? Genuinely curious what would get you to download a new notes app in 2026.


If you want to follow along: I'm @protomota on X/Twitter, and I'll be posting updates here regularly.

Sign up for early access at jotley.ai — we're launching soon!

Thanks for reading. Happy to answer any questions about SwiftUI, native dev, or indie dev life in general. šŸ™