Meetre — meeting transcripts that never leave your Mac
I built an open-source Mac tool that records a meeting, transcribes it with Whisper, labels the speakers and summarises it with Qwen — all on-device, no cloud, no accounts.
I take a lot of meetings, and I never made peace with how I documented them. Half-listening while scribbling notes meant I missed the actual conversation. Handing the recording to a cloud service meant quietly accepting that everything said in that room now lived on someone else's server. So I built Meetre.
Meetre records a meeting on your Mac and turns it into a clean, speaker-labelled, summarised transcript. The whole point lives in one word: local. Nothing leaves your machine. No upload, no account, no terms of service deciding what happens to your words.
The models doing the work
The transcription runs on Whisper large-v3-turbo, accelerated through Apple's MLX framework so it actually flies on Apple Silicon. If you don't need the full model, you can drop down to small or medium to save disk and time.
The summaries come from a local LLM that reads the transcript and writes down what was actually decided. By default the model is set to auto, which picks the biggest one that fits your RAM: an 8 GB MacBook lands on qwen3.5-4b, a 32 GB machine gets qwen3.5-35b, and you never have to think about it. The current options are Qwen3.5, a hybrid reasoning model that thinks through the meeting before it writes a word, and Gemma 4, which gives the strongest multilingual output. The hidden reasoning gets stripped from what you keep, so what's left is four clean sections: summary, decisions, tasks, open questions.
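To make "biggest one that fits your RAM" concrete, here is a minimal sketch of that kind of selection. It is an illustration, not Meetre's actual code: the function name pick_model, the CATALOGUE table and the exact RAM thresholds are assumptions. Only the two model names and the 8 GB and 32 GB examples come from the text above.

```python
from typing import Optional

# Hypothetical catalogue of (model name, minimum RAM in GB).
# The thresholds are assumed; the article only says an 8 GB machine
# gets qwen3.5-4b and a 32 GB machine gets qwen3.5-35b.
CATALOGUE = [
    ("qwen3.5-4b", 8),
    ("qwen3.5-35b", 32),
]


def pick_model(ram_gb: float, catalogue=CATALOGUE) -> Optional[str]:
    """Return the most demanding model whose RAM requirement fits, or None."""
    fitting = [(need, name) for name, need in catalogue if need <= ram_gb]
    if not fitting:
        return None
    # max() compares the RAM requirement first, so the largest fitting model wins.
    return max(fitting)[1]
```

With this table a 16 GB machine would still land on qwen3.5-4b, because the next step up needs more memory than it has.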
Speaker separation is the part I'm quietly proud of. Meetre records your mic and the system audio as separate stems, so it can already tell you and anyone in your room apart from the people on the call before any model runs. Then pyannote diarization runs on each stem on its own to split the individual voices within each side. Diarizing clean single-source audio is far more reliable than untangling a mono mix after the fact, and the transcript reads like a script instead of a wall of text.
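The two-stem idea can be sketched in a few lines. This is a simplified illustration rather than Meetre's implementation: the Segment class, the merge_stems function and the "room"/"call" prefixes are hypothetical names, and the diarization and transcription steps are assumed to have already produced the segments. The detail it shows is that each stem is diarized on its own, so the same label (say SPEAKER_00) can appear on both sides and mean different people. Prefixing the label with its stem keeps them apart before the two timelines are interleaved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    speaker: str  # label local to one stem, e.g. "SPEAKER_00"
    text: str


def merge_stems(mic: List[Segment], system: List[Segment]) -> List[Segment]:
    """Interleave two independently diarized stems into one timeline."""
    tagged = [
        # Prefix each label with its stem so identical labels do not collide.
        Segment(s.start, s.end, f"{side}/{s.speaker}", s.text)
        for side, stem in (("room", mic), ("call", system))
        for s in stem
    ]
    # Order by start time so the result reads like a script.
    return sorted(tagged, key=lambda s: (s.start, s.end))
```

Because the mic/system split is known from the recording itself, the room-versus-call distinction never depends on a model getting it right.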
Every one of those models runs on your own hardware. The cloud never enters the picture.
The best privacy feature is the one you never have to think about. With Meetre the data simply never leaves the room.
Driving it from the terminal
Install is one line. It downloads a local, relocatable Python and sets up permissions for you:
cd ~ && curl -fsSL https://github.com/maxlkatze/meetre/archive/refs/heads/main.tar.gz | tar -xz && cd meetre-main && bash install.sh
After that, the CLI stays out of your way:
# Start recording with a name
meetre record --name "Standup"
# Re-transcribe an audio file you already have
meetre transcribe call.mp3
# Summarise the latest transcript and push it into Apple Notes
meetre summarize
# View or change settings (model, language, toggles)
meetre config
There's also meetre cli if you'd rather pick from an interactive text menu than remember flags.
Or just use the menu bar
If the terminal isn't your thing, the ✦ icon sits in the menu bar and does everything the CLI does. Click it and you get a settings popup for the meeting name, the transcription and summary models, the language, plus toggles for system audio and speaker detection. The summary picker is sized for your machine: models too big for your RAM are greyed out, downloaded ones get a checkmark, and the reasoning models get a little brain. While it's working you see live status (recording time, a download bar, a spinning Transcribing…), and a native notification fires when the meeting's ready. It auto-updates with a git pull on every launch, can start at login, then stays invisible until you need it.
System audio gets captured straight through ScreenCaptureKit, so there's no BlackHole or loopback driver to install — the thing that usually turns "record a call" into a half-hour of audio-routing yak-shaving.
What comes out the other end is a Markdown transcript with timestamps and speaker labels, an MP3 backup of the audio, and a note in Apple Notes.
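For a feel of what a timestamped, speaker-labelled Markdown transcript involves, here is a small rendering sketch. The layout is an assumption made for illustration and not necessarily the format Meetre writes; format_timestamp and render_markdown are hypothetical helper names.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Turn a number of seconds into HH:MM:SS, dropping fractions."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_markdown(title: str, segments: Iterable[Tuple[float, str, str]]) -> str:
    """Render (start_seconds, speaker, text) tuples as a Markdown transcript."""
    lines = [f"# {title}", ""]
    for start, speaker, text in segments:
        lines.append(f"**[{format_timestamp(start)}] {speaker}:** {text}")
    return "\n".join(lines) + "\n"
```

Plain Markdown keeps the transcript readable in any editor and easy to search with ordinary text tools.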
Why local, and why open
Privacy is the obvious reason, and it's real. The quieter one is ownership. A tool running on your own hardware stops being a subscription you rent and becomes something you keep. It works on a plane. It works when the service you depend on has an outage. It will still work in five years, when the startup behind the cloud alternative has been acquired and shut down. Apple Silicon is fast enough now that sending audio to a server is a choice rather than a requirement, and I'd rather make the other one.
It runs on macOS 13 and up on Apple Silicon (M1 through M5), wants 8 GB of RAM at the low end and about 6 GB of disk for the models, and defaults to German because that's what most of my meetings are. Other languages and auto-detect are a toggle away. It's quick, too: a 30-minute meeting transcribes in two to four minutes on most M-series Macs, faster the newer the chip.
The whole thing is open source under the MIT license, so you can read exactly what it does with your audio before you trust it with a single meeting. It's on GitHub. I built it for myself first, which is usually the only honest reason to build anything.