the fine print

Everything lognote does, in one place.

The homepage covers the headline. This page covers the rest: every setting, every recovery path, every silent piece of machinery that makes the product work. Read it cover-to-cover or jump to a section.

The basics

Recording, the status pill, what lands in your note.

Starting and stopping a recording

Three ways to toggle recording — mic ribbon, command palette, or the PATH wrappers.

The Obsidian plugin gives you a mic ribbon icon in the left sidebar (click to toggle), plus three command-palette entries: Lognote: Start recording, Lognote: Stop recording, and Lognote: Toggle recording. Assign keyboard shortcuts under Settings → Hotkeys → search “lognote” — there’s no default chord because we don’t want to step on whatever you’ve already bound.

Outside Obsidian, lognote-record-start <path-to-note.md> and lognote-record-stop (installed in ~/.local/bin/ by setup.sh) do the same thing. The PATH wrappers are how you’d wire lognote into any editor that runs shell commands, or just record from a terminal without an editor at all.

Status bar pill

A live recording indicator in Obsidian's status bar — click to stop.

While a recording is active, the plugin shows ● recording 02:14 in Obsidian’s bottom-right status bar. Click it to stop. The pill polls $LOGNOTE_STATE_DIR/recording-state every second and verifies the capture process is still alive via process.kill(pid, 0), so if the capture binary crashes or you kill it from a shell, the pill clears within a second — it never lies about state.
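The liveness probe can be sketched in a few lines of Python. This is an illustration only: the plugin itself uses Node's process.kill(pid, 0), and os.kill(pid, 0) is the equivalent call here. The function name pid_is_alive is hypothetical, and how the pid is read from the recording-state file is left out because the file's layout isn't documented on this page.

```python
import os

def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid exists.

    Signal 0 delivers nothing; the OS only checks that the target exists
    and that we are allowed to signal it.
    """
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the capture binary is gone
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True
```

An indicator that runs a check like this on every poll can clear itself as soon as the capture process disappears, without waiting for anyone to update the state file.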

The pill only shows while Obsidian is in the foreground. For a global indicator that’s visible from anywhere, see the menu-bar-helper entry.

Menu bar helper

A global recording indicator that lives in macOS's menu bar.

LognoteMenuBar.app is a tiny AppKit helper (no special entitlements, no Dock icon) that surfaces recording state system-wide — handy because Obsidian’s status bar pill is only visible when Obsidian is in front. Idle shows a mic icon; while recording it shows a red record icon and elapsed time (● 02:14). Click the icon for a menu with Stop recording, Open transcription log, and Quit.

setup.sh builds it but doesn’t auto-launch — start it manually with open -a lognote-menubar-src/LognoteMenuBar.app. To start it at login, add it via System Settings → General → Login Items. It polls the same recording-state file the status bar pill does, with the same kill(pid, 0) liveness check, so the two indicators never disagree.

What lands in your note

A structured summary block followed by a collapsible full transcript.

When a recording finishes, lognote replaces the pending marker in your note with two things: a summary block (TL;DR, action items, decisions, open questions, topics discussed, notable quotes; a section with nothing to report is omitted rather than left empty) and a <details>-wrapped full transcript that renders as click-to-expand. Below both, a small footer (_Summarized by **provider** (model)_) records which backend produced the summary, so you can spot-check what auto-mode picked.

If summarization is disabled (SUMMARIZE_ENABLED=0), the transcript still lands as a plain section with no summary block above it — see the skip-summaries entry.

If summarization is enabled but fails (bad API key, provider unreachable, model returns nothing), the transcript still lands wrapped in the same <details> block, but the summary section is replaced with a visible ⚠️ failure block showing the provider, error class, and a pointer to the retry command. The <audio>.summary.failed sidecar is written alongside so the plugin’s scan and retry commands can find it later. See retry-failed-summary for how to recover.

me / others speaker labels

Mic audio is labeled "me"; system audio is labeled "others".

The two-track recording structure gives you free 2-label diarization: anything captured through the mic is tagged me, anything captured from system audio (Zoom, Teams, browser, etc.) is tagged others. Consecutive segments from the same speaker get grouped under one block in the transcript, with timestamps in _[mm:ss]_ format.
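As a rough sketch of the grouping step, the Python below merges consecutive segments from the same speaker into one block and stamps each block with its start time in _[mm:ss]_ form. The segment tuple shape and the bold speaker prefix are assumptions made for the example, not lognote's actual output format.

```python
from itertools import groupby

def format_timestamp(seconds: float) -> str:
    """Render a second offset as mm:ss."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

def group_segments(segments):
    """Group consecutive same-speaker segments into one block each.

    segments: list of (start_seconds, speaker, text), sorted by start time.
    """
    blocks = []
    for speaker, run in groupby(segments, key=lambda seg: seg[1]):
        run = list(run)
        start = run[0][0]                       # block takes the first segment's time
        text = " ".join(seg[2] for seg in run)  # later segments are folded in
        blocks.append(f"**{speaker}** _[{format_timestamp(start)}]_ {text}")
    return blocks
```

Given two mic segments followed by one system-audio segment, this produces two blocks: one me block and one others block.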

This is a v1 limitation worth knowing: every remote participant in a Zoom call gets pooled under others. Per-individual labeling (separating remote speakers from each other) is on the roadmap, not in the product today. If you need to attribute a quote to a specific person, the current workflow is the timestamp plus your own memory.

Marker-based insertion

An invisible HTML comment reserves the spot where your transcript will land.

When you start recording, lognote writes a unique TRANSCRIPT_PENDING HTML comment at your cursor. The comment is invisible in preview, but the post-transcribe script uses it to find the exact insertion point, so you can scroll, edit, navigate to other notes, even quit Obsidian, and the transcript still lands where you originally hit record.

If you’ve deleted the marker by the time transcription finishes, the transcript appends to the end of the note instead. If you’ve deleted the note itself, it falls back to inbox/ — see the inbox-fallback entry.
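The replace-or-append rule can be sketched as a small pure function. The marker string below is a made-up example (the exact comment syntax isn't shown on this page), and land_transcript is a hypothetical name; the inbox/ fallback for a missing note is out of scope here.

```python
# Hypothetical marker text, for illustration only.
EXAMPLE_MARKER = "<!-- TRANSCRIPT_PENDING:abc123 -->"

def land_transcript(note_text: str, marker: str, transcript: str) -> str:
    """Replace the marker if it is still present, else append to the end."""
    if marker in note_text:
        # Replace only the first occurrence so the rest of the note is untouched.
        return note_text.replace(marker, transcript, 1)
    # Marker was deleted: append after a blank line at the end of the note.
    separator = "" if note_text.endswith("\n") or not note_text else "\n"
    return f"{note_text}{separator}\n{transcript}\n"
```

Because the lookup is by marker text rather than by cursor position or line number, edits made elsewhere in the note while transcription runs don't move the landing spot.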

Frontmatter and metadata

What headers and links lognote adds to a generated note.

For live recordings, lognote doesn’t touch your note’s frontmatter — it inserts at the marker and leaves everything above and below alone. The transcript block opens with a ## 🎙️ Transcript heading and an _Audio: [[<absolute-path-to-m4a>]]_ wikilink so you can click through to the source audio at any time. The summary footer records the provider and model name.

Notes produced by lognote-import and lognote-resplit are different: they create new notes from scratch and write YAML frontmatter at the top (title, source / source_audio, source_range, imported_at, generated_by) so you can tell at a glance where the note came from and which audio it traces back to.

Inbox fallback

When the target note is gone, transcripts land in inbox/ instead of being lost.

If you delete (or rename, or otherwise lose track of) the note between hitting record and transcription finishing, lognote drops the result in $LOGNOTE_NOTES_DIR/inbox/transcript-<timestamp>.md instead of failing the run. A macOS notification fires so you know it landed there rather than where you originally pointed it.

The inbox directory is created on demand, so you don’t need to set it up. If you find yourself with a backlog there, it usually means a workflow problem worth fixing — but the work product is never lost.

Your lognote (usage stats)

A glanceable card of fun stats — notes, time, words, and your talk-ratio — computed locally, never sent anywhere.

The companion app’s General tab carries a Your lognote card: a small set of stats about how you’ve used the app. It’s there to glance at, not to act on. There are no buttons, no goals, no nudges — just numbers that add up as you record.

Everything on the card is computed on your Mac from a local usage ledger. Nothing about it leaves the device, and lognote never phones home to assemble it. This is consistent with the audio-stays-on-your-Mac promise: the stats are derived from your own recordings, on your own machine.

What’s on the card

Notes recorded — all-time, plus a this-week count.
Total time captured — the summed length of every recording.
Words captured — the running word count across your transcripts.
Talk-ratio — “you spoke X% of the time.” Because lognote records your mic and the system audio as two separate tracks, it already knows your share of each conversation versus everyone else’s. When your share is low, the card says so gently (“mostly listening 👂”). It’s an observation, not a score.
Average session length and longest session — the typical recording, and your single longest one.
Busiest day of the week — a plain observation of when you capture most (for example, “you capture most on Tuesdays”).
Streak — your current and best run of days with at least one recording. It’s shown for interest only; the card never frames it as something to maintain or warns you about losing it.
Capturing since — the date your first recording landed under this build.

A few honest notes

The card starts fresh from this release. It doesn’t backfill recordings you made before it shipped, so the numbers grow from here rather than reflecting your whole history with lognote.

There’s no “time saved” or productivity figure. lognote doesn’t estimate what typing the same notes would have cost you, because any such number would be made up. The card sticks to what it can actually count.

The talk-ratio is the one stat most tools can’t show, and it falls out of the two-track capture for free — no extra processing, no upload, no separate model. It’s a small reflection of how your meetings actually go.
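To make two of the stats concrete, here is a minimal sketch of how a talk-ratio and a best streak could be derived. It assumes the ledger can supply speech seconds per track and a set of recording dates; those inputs and both function names are assumptions, not the app's real ledger schema.

```python
from datetime import date, timedelta

def talk_ratio(me_seconds: float, others_seconds: float) -> float:
    """Share of speech time that came from the mic track (0.0 to 1.0)."""
    total = me_seconds + others_seconds
    return 0.0 if total == 0 else me_seconds / total

def best_streak(days: set) -> int:
    """Longest run of consecutive calendar days that each had a recording."""
    best = 0
    for day in days:
        if day - timedelta(days=1) in days:
            continue  # not the first day of a run, so skip it
        length = 1
        while day + timedelta(days=length) in days:
            length += 1
        best = max(best, length)
    return best
```

For example, 30 seconds on the mic track against 90 seconds of system audio gives a talk-ratio of 0.25, the kind of value the card would describe as mostly listening.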

Settings & config

Every knob you can turn, and where your secrets live.

Settings

Configure lognote in the companion app. ~/.config/lognote/env is the single source of truth.

lognote settings live in the Lognote companion app, not in the Obsidian plugin. Open the app (it lives wherever you installed it — typically your Applications folder) after the setup walkthrough to manage settings.

The Obsidian plugin is a thin recording trigger: it passes recording commands to the bash layer and inserts the transcript marker into your active note. Configuration — provider, API keys, transcription model, retention — all comes from the app, which writes ~/.config/lognote/env.

Layout

After the walkthrough, the app’s home window is the settings manager. It is organized into four tabs:

General — setup status, walkthrough controls, updates, build info
Transcription — WhisperKit model, language
Summarization — provider selection, credentials, auto-order, “Test connection”
Recordings — audio retention, auto-stop on silence

General tab

Setup status — compact health summary of each setup step (permissions, model warmup, plugin install). Loads quietly on every app open; no animation runs unless you explicitly request a re-check.
Re-check permissions — runs the test capture in place (the same ~8-second capture the walkthrough uses) without requiring a full walkthrough re-run. Use this after adjusting audio routing or system-audio settings to confirm both tracks are live.
Re-run walkthrough — resets wizard progress so you can step through the guided setup again. Settings, models, and TCC grants are untouched.
Updates — an Automatically check for updates toggle (on by default — Lognote checks daily in the background and prompts you to install when a new version is found) and a Check for Updates… button to check right now. See Automatic updates for how the signed over-the-air update flow works.
Version footer — current build info.

The vault is established during the walkthrough and is not editable here; it is where the plugin is installed and where notes land.

Saving any tab shows a brief “Settings saved” confirmation.

Transcription tab

Transcription model (WHISPERKIT_MODEL) — friendly dropdown: Large v3 Turbo (recommended) plus other WhisperKit models, and a “Custom…” option that accepts a free-text model name. Default is Large v3 Turbo (large-v3-v20240930_turbo). Larger models are more accurate but slower; the default is a good balance.
Language (LANGUAGE) — spoken-language hint for WhisperKit. Default: Auto-detect (recommended) — WhisperKit infers the language from the first few seconds of audio. To force a specific language, choose from the curated list (English, Spanish, French, German, Italian, Portuguese, Dutch, Japanese, Chinese, Korean) or type an ISO 639-1 code in the Custom field. Transcription handles all languages Whisper supports; summary quality depends on the chosen provider and model.

Summarization tab

Summaries enabled (SUMMARIZE_ENABLED) — turn off to skip the LLM step entirely; the transcript still lands in your note.
Provider (LLM_PROVIDER) — pick the LLM backend: auto, openai, anthropic, azure, ollama, or local-mlx. The credential fields below change to match the selected provider; values for all providers are kept on disk so switching back doesn’t lose them.

Each cloud/network provider section includes a “Test connection” button. It validates the current (unsaved) form values by running a minimal live request (lognote-engine check-provider <provider>) and immediately shows the outcome — “Connected” or a specific error pointing at the exact problem (e.g. “Invalid API key (401)”, “Model not found”, “Endpoint unreachable”). No need to run a full recording to find out a key is wrong.

Provider credential fields:

OpenAI — OPENAI_API_KEY and optional OPENAI_MODEL (default gpt-4o-mini).
Anthropic — ANTHROPIC_API_KEY and optional ANTHROPIC_MODEL (default claude-haiku-4-5).
Azure OpenAI — AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_DEPLOYMENT, optional AZURE_OPENAI_API_VERSION.
Ollama — OLLAMA_ENDPOINT and optional OLLAMA_MODEL.
Local model override (LLM_MODEL_OVERRIDE) — pin a specific HuggingFace repo for local-mlx summarization, bypassing the auto-switch by transcript length. See model-override.

Auto-order

When provider is set to auto, the tab shows a drag-reorderable list of providers. Drag to define the order auto probes them; the order is written as LLM_AUTO_ORDER (comma-separated provider ids, e.g. openai,anthropic,azure,ollama,local-mlx). Only configured providers are actually tried, regardless of their position in the list. See provider-auto-detection for the default probing order and fallback behavior.

Recordings tab

Audio retention days (LOGNOTE_AUDIO_RETENTION_DAYS) — how long raw .m4a files are kept before opportunistic cleanup at the next record-start. Default 90. Set to 0 to disable cleanup.
Auto-stop on silence (AUTO_STOP_SILENCE_SECONDS) — a minutes stepper with an on/off switch. When both the mic and system-audio tracks go quiet for this long, the recording stops itself and transcription runs normally. Default 5 minutes; turn the switch off to keep recording until you stop manually (written as 0). A warn-me field (SILENCE_WARN_BEFORE_SECONDS, default 60s) sets how many seconds before the auto-stop you get a heads-up notification. Changes apply to your next recording. The lower-level knobs — silence threshold (SILENCE_THRESHOLD_DB) and initial grace period (SILENCE_INITIAL_GRACE_SECONDS) — remain power-user env; edit them directly in ~/.config/lognote/env. See auto-stop-on-silence.
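The retention rule in the Recordings tab comes down to an age cutoff with 0 as the off switch. The sketch below assumes age is measured by file modification time and that raw audio sits as .m4a files directly in one directory; both are assumptions, and expired_audio is a hypothetical helper that only lists candidates and deletes nothing.

```python
import time
from pathlib import Path

def expired_audio(audio_dir, retention_days, now=None):
    """List .m4a files older than the retention window.

    retention_days <= 0 disables cleanup, mirroring the documented
    "set to 0 to disable" behaviour.
    """
    if retention_days <= 0:
        return []
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds in a day
    return sorted(
        path for path in Path(audio_dir).glob("*.m4a")
        if path.stat().st_mtime < cutoff
    )
```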

How settings are stored

Save writes all configured fields to ~/.config/lognote/env. The runtime reads this file (via lib/lib.sh) as the authoritative config source. Sensitive values — API keys — live in this file (mode 0600), outside any vault or git repo.
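For scripts that want to read the same file, a minimal parser might look like the sketch below. It assumes shell-style KEY=value lines, optionally prefixed with export, with # comments; the real file is sourced by the shell layer, so this only approximates what bash would see. parse_env is a hypothetical helper.

```python
import shlex

def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, tolerating an 'export ' prefix and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # not an assignment
        # shlex strips quotes and trailing comments the way a shell would.
        parts = shlex.split(raw, comments=True)
        values[key.strip()] = parts[0] if parts else ""
    return values
```

Since the file holds API keys, anything that reads it should avoid logging the parsed values.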

Power-user and operational tuning keys are not surfaced in the app UI. They stay file-only knobs you edit in ~/.config/lognote/env directly: silence-watcher thresholds (SILENCE_THRESHOLD_DB, SILENCE_INITIAL_GRACE_SECONDS), dedup/VAD (LOGNOTE_DEDUP_*, LOGNOTE_MIC_VAD_*), max recording length (LOGNOTE_MAX_RECORDING_SECONDS), audio directory (LOGNOTE_AUDIO_DIR), state directory (LOGNOTE_STATE_DIR). See env.example in the repo for documented defaults. The app preserves these lines untouched on every Save. (The auto-stop duration and warn-me lead-time, AUTO_STOP_SILENCE_SECONDS and SILENCE_WARN_BEFORE_SECONDS, are the exception: they are now managed by the Recordings tab rather than being file-only keys.)

The Obsidian plugin settings tab

The plugin’s Settings → lognote tab now shows only detected paths (repo path and vault dir, read-only) and a link to the Lognote app for configuration. Recording, the mic ribbon, the status bar pill, and “Retry failed summaries” all work the same as before.

AI providers

Your choice. Local-mlx by default; OpenAI, Anthropic, Azure, Ollama if you want them.

Provider auto-detection

With LLM_PROVIDER=auto, lognote probes providers in priority order and uses the first one configured.

LLM_PROVIDER=auto (the default) probes the available providers in order and uses the first one that’s configured. The default order is: OpenAI → Anthropic → Azure → Ollama → local-mlx. Cloud providers check for their respective API keys; Ollama checks its local endpoint. local-mlx sits at the end as the guaranteed last resort — auto falls back to it even when nothing else is configured, because the engine would rather try the on-device path and fail loudly than report “no provider available.” That readiness check looks only at the default 3B (or an LLM_MODEL_OVERRIDE model, if set); the larger 8B/14B models for long transcripts are selected later, at summarize time, by transcript length.

The default order reflects a simple bet: if you’ve paid for a cloud key, you probably want it used; if you’re running Ollama locally, that’s faster than spinning up MLX; local-mlx is the always-available free fallback. You can override the auto-pick at any time by selecting a specific provider in the Lognote app’s Summarization tab (or by setting LLM_PROVIDER directly in ~/.config/lognote/env) — the dispatcher always honors an explicit choice.

Customizing the auto order

If you want auto to try a different priority — for example, Anthropic before OpenAI, or local-mlx before any cloud provider — set LLM_AUTO_ORDER to a comma-separated list of provider ids:

LLM_AUTO_ORDER=anthropic,openai,azure,ollama,local-mlx

The engine skips any provider in the list that isn’t configured (no key, no endpoint), so you can list all five and the order only matters for the ones you’ve actually set up. LLM_AUTO_ORDER is ignored when LLM_PROVIDER is set to a specific provider.
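The selection rules above can be sketched as one function. This is a reading of the documented behaviour, not the engine's code: resolve_provider is a hypothetical name, and the configured set stands in for whatever key and endpoint checks the engine really performs.

```python
DEFAULT_ORDER = ["openai", "anthropic", "azure", "ollama", "local-mlx"]

def resolve_provider(env: dict, configured: set) -> str:
    """Pick a provider from LLM_PROVIDER / LLM_AUTO_ORDER.

    configured: provider ids that have the credentials or endpoint they need.
    """
    explicit = env.get("LLM_PROVIDER", "auto")
    if explicit != "auto":
        return explicit  # an explicit choice wins; LLM_AUTO_ORDER is ignored
    raw = env.get("LLM_AUTO_ORDER", "")
    order = [p.strip() for p in raw.split(",") if p.strip()] or DEFAULT_ORDER
    for provider in order:
        if provider in configured:
            return provider
    return "local-mlx"  # last resort even when nothing is configured
```

With only Anthropic and Ollama configured, the default order picks anthropic; setting LLM_AUTO_ORDER=ollama,anthropic flips that to ollama.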

In the Lognote app, this is exposed as a drag-reorderable list in the Summarization tab when provider is set to auto. Dragging reorders the list and saves LLM_AUTO_ORDER on the next Save. Leave LLM_AUTO_ORDER unset (the default) to use the engine’s built-in order shown above — fully back-compatible with existing installs.

Local MLX (on-device summaries)

On-device summarization in the native engine — free, private, no signup, no network at summary time.

The local-mlx provider runs entirely on-device, natively inside lognote-engine via mlx-swift (Apple’s MLX framework). No API key, no per-summary network round-trip, no cost — and the audio-stays-local promise extends to the summary step too. (The only network touch is a one-time model download: the default 3B during setup, and the larger models once each on first need.) setup.sh pre-downloads the default 3B model (mlx-community/Llama-3.2-3B-Instruct-4bit, ~2 GB) during install so the first auto-detect hit doesn’t stall on a download. Pass --skip-llm-download if you’d rather defer that.

The model auto-switches by transcript length, capped by your Mac’s memory. Longer transcripts prefer a larger model (the 3B already on disk, then Llama-3.1-8B at ~5 GB, then Qwen2.5-14B at ~8 GB, the larger two lazy-downloaded on first need), and lognote picks the largest one that fits your Mac’s RAM (roughly 16 GB for the 8B, 32 GB for the 14B), always falling back to the 3B. So a long meeting on a 16 GB Mac is summarized by the 8B rather than the 14B: you always get a summary, and the choice stays within the machine’s headroom. When a larger model needs downloading, the one-time pause is reported on stderr and in the import progress line so you understand the wait. To skip the auto-switch and pin a specific model, see the model-override entry.
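A sketch of the length-and-memory rule follows. The RAM floors come from the rough figures above; the word-count thresholds are invented for the example, since this page doesn't state where the length buckets fall. pick_model and the MODELS table are hypothetical.

```python
from typing import Optional

# (label, minimum RAM in GB, minimum transcript words).
# The word thresholds are placeholders, not lognote's real bucket boundaries.
MODELS = [
    ("3B", 0, 0),
    ("8B", 16, 4000),
    ("14B", 32, 12000),
]

def pick_model(transcript_words: int, ram_gb: int, override: Optional[str] = None) -> str:
    """Choose the largest model that the transcript length calls for and RAM allows."""
    if override:
        return override  # a pinned model bypasses the auto-switch entirely
    choice = MODELS[0][0]  # the 3B is always the fallback
    for label, min_ram, min_words in MODELS:
        if transcript_words >= min_words and ram_gb >= min_ram:
            choice = label
    return choice
```

Under these placeholder thresholds, a very long transcript on a 16 GB machine resolves to the 8B, matching the example in the paragraph above.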

Cloud providers (OpenAI, Anthropic, Azure, Ollama)

BYOK for the four supported cloud / network backends.

If you have a cloud LLM account, lognote will use it. All four backends are BYOK — lognote doesn’t proxy through any service of its own. Put credentials into the Lognote app (Summarization tab) or directly into ~/.config/lognote/env:

OpenAI — OPENAI_API_KEY, optional OPENAI_MODEL (default gpt-4o-mini). The general-purpose default; fast and cheap.
Anthropic — ANTHROPIC_API_KEY, optional ANTHROPIC_MODEL (default claude-haiku-4-5). Best summaries we’ve measured, slightly slower.
Azure OpenAI — AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_DEPLOYMENT, optional AZURE_OPENAI_API_VERSION (default 2024-10-21). For users on Azure tenants.
Ollama — OLLAMA_ENDPOINT (default http://localhost:11434), optional OLLAMA_MODEL. Local server, OpenAI-compatible API. If you don’t set OLLAMA_MODEL, lognote picks the first model ollama list returns and logs that choice on stderr.

The system prompt is fixed across providers (in the native lognote-engine summarize path) so the summary shape is the same no matter which backend ran — only the model’s interpretation differs. All four of these backends run natively in the engine.

Test connection

Each provider section in the app has a “Test connection” button. It validates the credentials currently entered in the form (before saving) by running lognote-engine check-provider <provider> — a minimal live request that exercises key authentication, the model name, and the endpoint together. The result is either “Connected” or a specific, human-readable error: “Invalid API key (401)”, “Model not found”, “Endpoint unreachable”, etc. This lets you confirm credentials work without waiting for a full recording to process.

The same engine verb (lognote-engine check-provider <provider>) is available from the command line for scripting or troubleshooting. It reads provider config from env, prints structured JSON to stdout ({"ok":true} or {"ok":false,"error_class":"auth|model|endpoint|network|other","message":"…"}), and always exits 0.
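Because the command always exits 0, a script has to inspect the JSON rather than the exit status. Below is a minimal sketch of that check in Python; interpret_check is a hypothetical helper, and it takes the captured stdout as a string, so how you invoke the engine is up to you.

```python
import json

def interpret_check(stdout: str) -> tuple:
    """Turn check-provider's JSON output into (ok, human-readable detail)."""
    result = json.loads(stdout)
    if result.get("ok") is True:
        return True, "Connected"
    error_class = result.get("error_class", "other")
    message = result.get("message", "")
    return False, f"{error_class}: {message}"
```

The error_class values listed above (auth, model, endpoint, network, other) make it easy to branch, for example retrying on network but stopping on auth.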

Pinning a specific local model

LLM_MODEL_OVERRIDE pins a HuggingFace repo for local-mlx, bypassing the auto-switch.

By default, local-mlx picks its model by transcript length (3B / 8B / 14B buckets). To pin a single model regardless of meeting size, set LLM_MODEL_OVERRIDE to any compatible HuggingFace repo id — either through the Local model override field in Settings, or by adding export LLM_MODEL_OVERRIDE=... to ~/.config/lognote/env.

Useful when you’ve benchmarked a specific quant and want consistent output, when you’ve already downloaded a bigger model and want it used for everything, or when you want to test a non-default model family (Mistral, Qwen variants, etc.) without modifying the source.

Skipping summaries entirely

Turn the summarize step off when you only want a raw transcript.

Turn off Summaries enabled in the Lognote app’s Summarization tab (or set SUMMARIZE_ENABLED=0 in ~/.config/lognote/env) and lognote skips the LLM call entirely. The transcript still lands in your note, just without the structured summary block above it and without the collapsed <details> wrapper.

Reasons to do this: you process transcripts elsewhere, you don’t trust LLM summaries for the kind of conversation you’re recording, you’re saving cloud-API cost, or you’re on a machine where local-mlx is too slow and you don’t want the wait. Summarization can be re-enabled at any time; it’s a runtime flag, not a build-time decision.

When meetings go sideways

Runaway recordings, interrupted fragments, failed summaries, orphan markers: all recoverable.

Recording joiner

Stitches interrupted .m4a fragments back into one dual-track file.

Sometimes a recording gets interrupted (force quit, reboot, power loss) and you end up with several .m4a fragments instead of one file. bin/lognote-join <fragment1> <fragment2> ... stitches them back into one dual-track recording and runs the normal transcribe → summarize → land-in-your-vault pipeline on the result. You don’t lose the meeting; you just get one note instead of one per fragment.

Pass --target-note <path> to land the joined transcript into a specific existing note (replacing a TRANSCRIPT_PENDING_JOINED marker, or appending if none); without it, the result goes to inbox/. The Obsidian plugin’s Lognote: Join recordings… command wraps the same engine with a UI for picking fragments visually.

Retry a failed summary

When the LLM call fails, the transcript still lands and a sidecar records the error for later retry.

Summarization is treated as best-effort — if the provider is unreachable, the key is wrong, or the model returns nothing, the transcript still gets inserted into your note. The failure is recorded in a <audio>.summary.failed JSON sidecar next to the .m4a, with the error class, error message, target note, and marker id.

bin/lognote-retry-summary --list shows pending failures (one stanza per sidecar so you can eyeball which need credential fixes versus network retries). bin/lognote-retry-summary --run retries each: it regenerates the transcript markdown from the existing .transcript.json sidecar (no re-transcription), runs lognote-engine summarize against it, and patches the original note in place — replacing the ⚠️ summary-failed block with a real summary + collapsed-transcript layout. Successful retries delete their sidecar. The Obsidian plugin gives you two commands for the same loop without leaving the editor: Lognote: Show failed summaries (no retry) lists what’s pending, and Lognote: Retry failed summaries runs the CLI from inside Obsidian and surfaces the result via Notice.

Orphan marker recovery

Reconciles abandoned TRANSCRIPT_PENDING markers with the matching audio file.

If the transcription pipeline is killed mid-flight (reboot, force-quit, OOM, kill -9), the TRANSCRIPT_PENDING marker stays embedded in your note with nothing to replace it. bin/lognote-recover scans your vault for those orphans and matches each to the most likely audio file in $LOGNOTE_AUDIO_DIR by comparing the marker’s embedded timestamp to the audio’s filename timestamp, within ±60s tolerance (override via LOGNOTE_RECOVER_TOLERANCE_SEC).

bin/lognote-recover (or --list) shows the matches without doing anything; bin/lognote-recover --run re-invokes the transcription pipeline for each matched orphan. The tool never auto-runs — --run is always explicit. The Obsidian plugin’s Lognote: Scan for orphan recording markers command surfaces the same list.
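The timestamp matching can be sketched as a nearest-neighbour search with a cutoff. The example assumes the marker and filename timestamps have already been parsed into datetime values (the exact timestamp formats aren't documented here), and match_audio is a hypothetical name.

```python
from datetime import datetime

def match_audio(marker_time, audio_times, tolerance_sec=60):
    """Return the audio file whose timestamp is closest to the marker's.

    audio_times: dict mapping file name to recording start time.
    Returns None when nothing falls within the tolerance window.
    """
    best_name, best_gap = None, None
    for name, recorded_at in audio_times.items():
        gap = abs((recorded_at - marker_time).total_seconds())
        if gap <= tolerance_sec and (best_gap is None or gap < best_gap):
            best_name, best_gap = name, gap
    return best_name
```

Returning None rather than the closest file regardless of distance keeps a stale marker from being paired with an unrelated recording.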

External transcript import

Run Zoom/Teams/Otter/VTT/SRT/plain transcripts through the same summarize-and-land pipeline.

If someone else recorded the meeting — Zoom’s auto-transcript, Teams, Otter, a copy-pasted plain-text transcript, a VTT/SRT subtitle file — lognote runs it through the same format → summarize → land-in-your-vault flow that a live recording would use. Format is auto-detected; override it (vtt, srt, otter, plain, json) if the sniffer misfires (rare, but possible with copy-pasted text that happens to look JSON-ish).
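To show why a sniffer can misfire on JSON-looking text, here is a small heuristic detector. It is an illustration under stated assumptions, not the engine's detection logic: it recognises VTT by its WEBVTT header, SRT by a numbered cue followed by a comma-millisecond time range, and JSON only if the text actually parses. Otter detection is left out because its export layout isn't described on this page.

```python
import json
import re

# A numeric cue index on one line, then "HH:MM:SS,mmm --> " on the next.
SRT_CUE = re.compile(r"^\d+\s*\n\d{2}:\d{2}:\d{2},\d{3} --> ", re.MULTILINE)

def sniff_format(text: str) -> str:
    """Guess a transcript format: 'vtt', 'srt', 'json', or 'plain'."""
    stripped = text.lstrip("\ufeff \t\r\n")  # drop a BOM and leading whitespace
    if stripped.startswith("WEBVTT"):
        return "vtt"
    if SRT_CUE.search(stripped):
        return "srt"
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass  # looked like JSON but isn't; treat it as plain text
    return "plain"
```

A detector that checked only the first character would label pasted text starting with a brace as JSON; requiring a successful parse is one way to keep that case in the plain bucket.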

Import is built into the native engine, so it works in the distributed app with no Python and no dev setup. The Obsidian plugin’s Lognote: Import external transcript… command opens a modal: paste a transcript or pick a file, choose a format, and pick where it lands: inbox/, a new note at a custom path, or replacing the first TRANSCRIPT_PENDING marker in an existing note (appends to EOF if no marker is found). The optional “copy original into <vault>/_archive/” toggle preserves the source file with cross-link frontmatter so it survives even if you later delete the produced note. Import runs in the background: clicking Import closes the dialog right away and the work continues while you keep working or start another recording. A status-bar pill (⬆ importing…) shows it running, and when the note lands you get a notice you can click to open it, plus a macOS banner if you have switched away from Obsidian. A long transcript is summarized by the best on-device model your Mac can run (see local-mlx), and the first import that needs a larger model downloads it once. Only one import runs at a time.

For scripting, the same flow is a one-liner: lognote-import <file> lands in inbox/; --format, --target-note <path> / --marker <id>, --output <path>, --title, and --archive-original mirror the modal’s options. The CLI and the plugin both drive the engine’s import verb, so the result is identical either way.

Audio file import

Point lognote at an existing audio recording (Voice Memo, Zoom/Teams export) and run it through the full transcribe → summarize → land pipeline.

Sometimes the recording already exists — a Voice Memo you dictated on your phone, a Zoom or Teams recording someone shared, an .mp3/.wav/.m4a from another tool. Audio file import runs that file through the exact same transcribe → format → summarize → land-in-your-vault pipeline a live lognote recording uses, without you having to re-record it live. It’s the audio sibling of external transcript import: that one takes an already-transcribed file, this one takes raw audio and transcribes it first with the same on-device WhisperKit model.

Supported formats. The picker lists the common ones — .m4a, .mp3, .wav, .aac, .m4b, .caf, .aiff, and .flac. That list is a convenience filter, not a hard limit: decoding goes through the same native macOS audio stack (AVFoundation) a live recording uses — the app ships no ffmpeg — so in practice any audio file macOS can decode will work. Pick “All files” in the dialog (or pass any path on the CLI) and it’ll transcribe if it’s decodable.

The Obsidian plugin’s Lognote: Import audio file… command opens a modal: choose an audio file, optionally set a title, and pick where the note lands — inbox/, a new note at a custom path, or appended to the end of an existing note. The optional “copy the original audio into <vault>/_archive/” toggle preserves the source file so it survives even if you later delete the produced note. Import runs in the background: clicking Import closes the dialog right away and the transcription continues while you keep working. When the note lands you get a notice you can click to open it, plus a macOS banner if you have switched away from Obsidian. The transcript is summarized by the best on-device model your Mac can run (see local-mlx), and only one import runs at a time.

One thing to know about speaker labels. A live lognote recording captures your microphone and the system audio on two separate tracks, which is what lets it split the transcript into “me” vs. “others” (see me/others speaker labels). An imported audio file is almost always a single mixed track — everyone’s voice is already blended together — so the transcript isn’t speaker-split. The imported note carries a short callout saying so, so it’s never a surprise. (A rare export that happens to carry two separate audio tracks, like some Zoom recordings, does get the split automatically.)

For scripting, the same flow is a one-liner: lognote-import-audio <audio-file> lands in inbox/; --output <path>, --title, and --archive-original mirror the modal’s options, and --target-note <path> appends to an existing note. The CLI additionally exposes --marker <id> to replace a specific TRANSCRIPT_PENDING marker in that note (true marker replacement; the modal only appends). The CLI and the plugin both drive the engine’s import audio-apply verb, so the result is identical either way. Like transcript import, this is built into the native engine, so it works in the distributed app with no Python and no dev setup.

Capture behaviors to expect

The quiet machinery: silence watcher, clamshell guard, device-switching, disk-full.

Auto-stop on silence

Recording stops itself after a configurable stretch of silence on both tracks.

A silence watcher tails the audio capture log and tracks the last “loud” sample on each track. When both the mic and system-audio tracks have been silent for AUTO_STOP_SILENCE_SECONDS (default 300s / 5 min, set to 0 to disable), it fires record-stop and transcription kicks in normally. You get a macOS notification SILENCE_WARN_BEFORE_SECONDS (default 60s) before the auto-stop (“make a sound to keep recording”) so you can intervene if you’re just thinking.

You can change the duration — or turn auto-stop off entirely — from the Lognote app under Settings → Recordings, right below audio retention: a minutes stepper with an on/off switch, plus the warning lead-time. The app writes AUTO_STOP_SILENCE_SECONDS (and SILENCE_WARN_BEFORE_SECONDS) to ~/.config/lognote/env, and the change takes effect on your next recording. The two lower-level knobs — silence threshold (SILENCE_THRESHOLD_DB) and initial grace period (SILENCE_INITIAL_GRACE_SECONDS) — stay power-user env only; edit them in ~/.config/lognote/env (see lib/config.sh).

The watcher handles wake-from-sleep correctly: if the Mac suspends mid-recording, the audio binary stops emitting ticks while wall-clock keeps advancing — naively diffing now vs. last-loud would fire auto-stop on wake. Instead, any gap of 60s+ between ticks is treated as a wake event and the silence countdown restarts from zero.
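
The countdown-and-wake logic above can be sketched in a few lines. This is a minimal illustration, not the watcher's actual code: the SilenceWatcher class and its on_tick method are hypothetical names, the loud/quiet flags stand in for whatever the capture log really reports, and the initial grace period and warning notification are left out.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_GAP_SECONDS = 60  # a gap this long between ticks is treated as a wake event


@dataclass
class SilenceWatcher:
    """Hypothetical sketch of the silence countdown (names are illustrative)."""
    auto_stop_after: float = 300.0      # mirrors AUTO_STOP_SILENCE_SECONDS
    last_tick: Optional[float] = None   # wall-clock time of the previous tick
    last_loud: Optional[float] = None   # when either track was last loud

    def on_tick(self, now: float, mic_loud: bool, system_loud: bool) -> bool:
        """Feed one capture-log tick; return True when auto-stop should fire."""
        if self.auto_stop_after <= 0:
            return False  # 0 disables auto-stop entirely
        first = self.last_tick is None
        woke = (not first) and (now - self.last_tick) >= WAKE_GAP_SECONDS
        self.last_tick = now
        if first or woke or mic_loud or system_loud:
            # First tick, wake from sleep, or real sound: restart the countdown.
            self.last_loud = now
            return False
        return (now - self.last_loud) >= self.auto_stop_after
```

The point of the woke check is that a long gap between ticks resets the countdown instead of counting as silence, so a Mac that slept for an hour doesn't auto-stop the moment it wakes.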

Max recording length

Optional hard wall-clock ceiling that auto-stops a recording before very long sessions can exhaust memory.

LOGNOTE_MAX_RECORDING_SECONDS puts a hard wall-clock ceiling on a single recording. When the limit is reached the recording auto-stops exactly like the silence auto-stop — transcription runs and the transcript lands in your note as usual. You get a macOS notification a short lead time before the stop (reusing SILENCE_WARN_BEFORE_SECONDS) so it’s never a surprise.

It ships disabled (0) and is opt-in. The native transcribe path holds the fully decoded audio in memory, so multi-hour recordings can run a memory-constrained Mac (8 GB) out of memory. A silent global cap would cut legitimate long meetings short far more often than it would save anyone, so it’s off by default and meant as a safety rail for people who knowingly run long sessions on constrained hardware. If you enable it, pick a value with headroom below the memory danger zone — roughly 45 minutes (2700) is a defensible backstop on an 8 GB machine; avoid ~90 minutes, which already sits inside the risk zone.

The cap is enforced by an independent wall-clock timer inside the silence watcher (a separate poll, not the log-driven silence loop), so it still fires even if the recorder stalls and stops emitting audio activity. Before stopping, it re-checks that the live recording is still the one it was started for, so a leftover timer can never stop an unrelated later recording. Because it lives in the watcher, it protects every front end equally — CLI wrappers, the Obsidian plugin, and the menu bar helper — once enabled. Set it via ~/.config/lognote/env or the plugin’s spawn environment; see lib/config.sh.
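
As a rough sketch of that decision, assuming the timer polls with the current time and knows which recording it was armed for: the cap_action function and its parameter names are hypothetical, and only the two env settings come from the text above.

```python
from typing import Optional


def cap_action(max_seconds: int, warn_before: int, started_at: float, now: float,
               armed_for: str, live_recording: Optional[str]) -> str:
    """Return "none", "warn", or "stop" for one poll of the wall-clock cap.

    max_seconds mirrors LOGNOTE_MAX_RECORDING_SECONDS and warn_before mirrors
    SILENCE_WARN_BEFORE_SECONDS; the recording ids are illustrative.
    """
    if max_seconds <= 0:
        return "none"   # ships disabled (0)
    if live_recording != armed_for:
        return "none"   # leftover timer: never stop an unrelated recording
    elapsed = now - started_at
    if elapsed >= max_seconds:
        return "stop"
    if elapsed >= max_seconds - warn_before:
        return "warn"   # heads-up notification before the stop
    return "none"
```

With the 45-minute backstop (2700) and the default 60-second lead time, this warns from 2640 seconds and stops at 2700.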

Clamshell guard

Refuses to start when the lid is closed and built-in speakers are the default output.

There’s a macOS quirk where the CoreAudio process tap captures only zeros when the default output device is the built-in MacBook speakers and the lid is closed. Audio is still audible (mirrored to the external display’s speakers or wherever), but the tap sees silence — so you’d record a full meeting of mic-only audio and only discover the missing system track at transcript time.

record-start detects this combination at launch (ioreg AppleClamshellState + the Swift binary’s --current-output inspection) and refuses to start, printing the available alternative outputs in the same message. Switch the output device via Control Center → Sound, or open the lid, and try again. The check costs nothing when the lid is open — no recording session is set up to perform it.
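
If you want to reproduce the lid check yourself, something like the following works as a sketch. It assumes ioreg prints a line of the form "AppleClamshellState" = Yes or No; the function names are hypothetical, and the built-in-speakers flag stands in for whatever the Swift binary's --current-output inspection reports.

```python
import re


def lid_closed(ioreg_output: str) -> bool:
    """True when the ioreg text reports "AppleClamshellState" = Yes."""
    match = re.search(r'"AppleClamshellState"\s*=\s*(Yes|No)', ioreg_output)
    return bool(match) and match.group(1) == "Yes"


def should_refuse_start(ioreg_output: str, output_is_builtin_speakers: bool) -> bool:
    """Refuse only for the bad combination: lid closed AND built-in speakers."""
    return lid_closed(ioreg_output) and output_is_builtin_speakers
```

One common way to get the input text is subprocess.run(["ioreg", "-r", "-k", "AppleClamshellState", "-d", "4"], capture_output=True, text=True).stdout; whether record-start uses exactly that invocation is an assumption.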

Output device switching mid-recording

Pairing AirPods or changing output devices mid-meeting doesn't break capture.

The aggregate device that powers system-audio capture is bound to a specific clock master (the current default output device). If you change outputs mid-recording — pair AirPods, plug in a USB DAC, switch to an external display’s speakers — the original clock master goes away and audio would drop on the floor.

The Swift binary watches kAudioHardwarePropertyDefaultOutputDevice and, on a change, tears down the aggregate + IOProc and rebuilds them against the new clock master without interrupting the session. The same process tap is reused, so no audio is lost across the transition. From your perspective, the recording just keeps going.

Disk-full notification

A macOS notification fires when AVAssetWriter starts dropping samples — usually because the disk filled up.

The Swift capture binary watches AVAssetWriter.append() for failures during a recording. After sustained losses (typical cause: $LOGNOTE_AUDIO_DIR ran out of space) it inspects the writer’s NSError, and if the message indicates out-of-disk it fires a single macOS notification: Disk full — audio write failed. Recording may be losing audio. Free space and stop the recording.

The notification is deduped per recording session — one alert, not a spam stream. Free space and stop the recording manually; the partial .m4a up to the failure point is still valid and goes through the normal transcribe pipeline on stop.

Audio retention cleanup

Old .m4a recordings get deleted automatically; notes referencing them are untouched.

At the start of every recording, lognote sweeps $LOGNOTE_AUDIO_DIR for .m4a files older than LOGNOTE_AUDIO_RETENTION_DAYS (default 90) and deletes them along with their sidecars (.transcript.json, .markers.json, .silence.json). Notes that reference the audio via [[wikilink]] are never touched — the wikilink just becomes a broken link, which Obsidian shows as red so you know the source is gone.

Set LOGNOTE_AUDIO_RETENTION_DAYS=0 to disable cleanup if you’d rather hold on to everything (or run your own pruning). The cleanup is opportunistic, not scheduled — it only runs when you start a new recording, so a long stretch without recording won’t churn through old files.
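
If you'd rather run your own pruning, the sweep is easy to approximate. This is a sketch under two assumptions the text doesn't confirm: that age is judged by file modification time, and that sidecars share the recording's stem (call.m4a next to call.transcript.json). The sweep_old_audio name is hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional

SIDECAR_SUFFIXES = (".transcript.json", ".markers.json", ".silence.json")


def sweep_old_audio(audio_dir: Path, retention_days: int,
                    now: Optional[float] = None) -> List[Path]:
    """Delete .m4a files older than retention_days, plus their sidecars."""
    if retention_days <= 0:
        return []  # 0 disables cleanup
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed: List[Path] = []
    for m4a in sorted(audio_dir.glob("*.m4a")):
        if m4a.stat().st_mtime >= cutoff:
            continue  # still inside the retention window
        # Assumption: sidecars are named <stem><suffix> beside the recording.
        for suffix in SIDECAR_SUFFIXES:
            sidecar = m4a.with_name(m4a.stem + suffix)
            if sidecar.exists():
                sidecar.unlink()
                removed.append(sidecar)
        m4a.unlink()
        removed.append(m4a)
    return removed
```

Notes are never opened or modified here; only files inside the audio directory are considered.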

transcription

Transcription behaviors to expect

Track-split diarization, hallucination filtering, cross-track dedup.

Track-split diarization

The two-track m4a gives you free 2-label diarization without a separate diarization model.

mlx-whisper doesn’t do speaker diarization natively, so lognote leans on the m4a’s structure: track 0 is system audio (everyone else, captured via the CoreAudio process tap), track 1 is your mic. Each track is extracted with ffmpeg -map 0:a:<N>, transcribed independently by mlx-whisper, tagged with speaker: "me" or speaker: "others", and the two streams are interleaved by start timestamp before format-transcript.py renders the result.
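
The tag-and-interleave step is simple enough to show in full. The sketch below assumes each transcribed segment is a dict with at least a start time and text; the function names and the WAV output are illustrative, and the ffmpeg arguments are just the -map selector from above wrapped in an argv list.

```python
from typing import Dict, List


def extract_track_cmd(m4a_path: str, track: int, out_path: str) -> List[str]:
    """ffmpeg argv that pulls one audio track (0 = system, 1 = mic) from the m4a."""
    return ["ffmpeg", "-y", "-i", m4a_path, "-map", f"0:a:{track}", out_path]


def interleave(others: List[Dict], me: List[Dict]) -> List[Dict]:
    """Tag each track's segments with a speaker label and merge by start time."""
    tagged = [dict(seg, speaker="others") for seg in others]
    tagged += [dict(seg, speaker="me") for seg in me]
    # sorted() is stable, so segments with equal start times keep "others" first.
    return sorted(tagged, key=lambda seg: seg["start"])
```

For example, two others segments starting at 0.0 and 5.0 and one me segment at 2.5 come out in the order others, me, others.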

See the me/others speaker labels entry for the v1 limitation: remote participants in a Zoom call all pool under others. The Silence-hallucination filter and Cross-track dedup and mic VAD entries cover the two passes that clean up the artifacts this approach introduces (Whisper hallucinations on silent tracks, loudspeaker bleed on the mic track).

Silence-hallucination filter

Drops the phantom "Thank you." / "Bye." segments Whisper invents on silent audio.

Whisper is known to invent text on silent audio — “Thank you.”, “Thanks for watching.”, “All right.”, “Bye.” — particularly when a chunk is near-silent. Before track-split, the mic and system streams were mixed so a hallucination on one track was masked by real audio on the other. Now that each track is transcribed independently, silent stretches surface those phantoms directly under whichever speaker label that track represents.

The engine catches them with a heuristic: a curated list of common hallucinated phrases, combined with suspicious-duration and no-speech-probability gates so genuine utterances aren’t dropped. Short conversational fillers (“okay”, “yeah”, “thank you”, “bye”) only count as a phantom when they are the whole segment — a real sentence that merely contains one of these words (“Okay, so let’s move the launch to Friday…”) is never dropped, no matter how long it runs. The longer YouTube-style artifacts (“thanks for watching”, “subscribe to my channel”) are matched anywhere in the text, since they don’t occur in real meeting speech. The filter runs on each track before tagging and interleaving, so the merged transcript you see in your note is already clean.
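
To make the whole-segment versus anywhere distinction concrete, here is a small sketch. The phrase lists are abbreviated from the examples above; the two thresholds and the way the gates combine (fillers need a gate to trip, YouTube-style artifacts don't) are assumptions for illustration, not the engine's real values.

```python
import re

# Abbreviated, illustrative lists; the engine's curated list is longer.
WHOLE_SEGMENT_FILLERS = {"okay", "yeah", "thank you", "bye", "all right"}
ANYWHERE_ARTIFACTS = ("thanks for watching", "subscribe to my channel")

SUSPICIOUS_DURATION_SECONDS = 10.0  # hypothetical: a filler stretched this long
NO_SPEECH_PROB_GATE = 0.6           # hypothetical no-speech-probability gate


def _normalize(text: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z]+", " ", text.lower()).split())


def is_phantom(text: str, duration: float, no_speech_prob: float) -> bool:
    """Heuristic: should this segment be dropped as a hallucination?"""
    norm = _normalize(text)
    if any(artifact in norm for artifact in ANYWHERE_ARTIFACTS):
        return True  # matched anywhere in the text
    if norm in WHOLE_SEGMENT_FILLERS:
        # A filler only counts when it is the whole segment AND a gate trips.
        return (duration >= SUSPICIOUS_DURATION_SECONDS
                or no_speech_prob >= NO_SPEECH_PROB_GATE)
    return False  # real sentences are never dropped, however long
```

Under these assumptions a lone “Thank you.” spanning 28 seconds is dropped, while “Okay, so let’s move the launch to Friday” survives regardless of duration.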

Cross-track dedup and mic VAD

When recording on speakers (not headphones), the mic picks up the speaker output and Whisper double-transcribes the same speech. Two filters drop the bleed.

If you record a Zoom call with the audio coming out of your laptop speakers (not headphones), the mic captures the speaker output too — so Whisper transcribes the same speech twice, once on the cleaner others track and once as bleed on the me track. Two filters handle this:

The text-similarity dedup pass compares every me segment against others segments within ±5s (Whisper drifts a few seconds between tracks on long recordings). When the Szymkiewicz–Simpson overlap coefficient is ≥0.7, the me segment is dropped as a duplicate. Tunable via LOGNOTE_DEDUP_SIMILARITY_THRESHOLD / LOGNOTE_DEDUP_TIME_WINDOW; disable entirely with LOGNOTE_DEDUP_CROSS_TRACK=0.

The mic VAD pass is a belt-and-suspenders second filter for residual bleed that escapes the text-similarity heuristic. It uses ffmpeg silencedetect to find silence intervals on the mic track, then drops any me segment whose audio range was ≥70% silent — if your mic was silent during a span, anything Whisper transcribed there has to be acoustic leakage. Disable with LOGNOTE_MIC_VAD_ENABLED=0.
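
Both filters reduce to a few lines of arithmetic. The sketch below assumes the overlap coefficient is computed over word sets, that the time window is compared on segment start times, and that silence intervals (as you'd parse them from silencedetect output) don't overlap each other. The function names are hypothetical; the 0.7, ±5s, and 70% defaults come from the text.

```python
import re
from typing import Dict, List, Tuple


def overlap_coefficient(a: str, b: str) -> float:
    """Szymkiewicz-Simpson coefficient: |A & B| / min(|A|, |B|) over word sets."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def is_cross_track_duplicate(me_seg: Dict, others: List[Dict],
                             threshold: float = 0.7, window: float = 5.0) -> bool:
    """True when a nearby others segment says (nearly) the same thing."""
    return any(
        abs(o["start"] - me_seg["start"]) <= window
        and overlap_coefficient(me_seg["text"], o["text"]) >= threshold
        for o in others
    )


def silent_fraction(start: float, end: float,
                    silences: List[Tuple[float, float]]) -> float:
    """Share of [start, end] covered by (non-overlapping) silence intervals."""
    if end <= start:
        return 0.0
    covered = sum(max(0.0, min(end, s_end) - max(start, s_start))
                  for s_start, s_end in silences)
    return covered / (end - start)


def is_mic_bleed(seg: Dict, silences: List[Tuple[float, float]],
                 min_silent: float = 0.7) -> bool:
    """True when the mic was silent for most of the segment's span."""
    return silent_fraction(seg["start"], seg["end"], silences) >= min_silent
```

The overlap coefficient divides by the smaller set, which is why a short bleed fragment still scores high against the longer, cleaner others segment it came from.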

permissions

Permissions

Two TCC grants on first run, and how to re-test them.

Microphone and System Audio Recording permissions

Two click-Allow prompts on first run, both attributed to "Lognote" in System Settings.

The first time you hit record, macOS shows two click-Allow TCC prompts, both labeled Lognote: one for Microphone (the obvious one) and one for System Audio Recording (gates the CoreAudio process tap that captures audio from other apps). Two separate prompts because macOS treats them as distinct TCC services with different threat models — there’s no single combined grant.

Both prompts come from Info.plist usage descriptions on the bundled Lognote.app. Re-signing the bundle with the same identifier (com.shariqh.lognote.recorder) preserves prior grants across rebuilds, so setup.sh upgrades don’t make you re-grant. To re-test the permission flow:

tccutil reset Microphone   com.shariqh.lognote.recorder
tccutil reset AudioCapture com.shariqh.lognote.recorder

Note the tccutil service name for System Audio Recording is AudioCapture, not SystemAudioCapture or kTCCServiceAudioCapture — verified empirically on macOS 26.4.

install

Install, upgrade, uninstall

One command up, one command down. No drift.

One-command setup

./setup.sh handles brew prereqs, venv, Swift build, model download, and config in one shot.

./setup.sh is the single entry point. It verifies prereqs (Apple Silicon only — Intel is hard-rejected), brew-installs python@3.11 / ffmpeg / node if missing, builds the .venv with the pinned mlx-whisper version, builds the Swift Lognote.app bundle (preserving its bundle ID so prior TCC grants survive), builds the menu bar helper, builds the Obsidian plugin, pre-downloads the ~2 GB local-mlx default model, installs the ~/.local/bin/lognote-record-{start,stop} PATH wrappers, and runs an interactive config block for vault path and LLM provider credentials.

Re-running ./setup.sh is an upgrade — every step is idempotent. If setup.sh itself changed in the pull, it auto-re-execs the new version, so a single invocation always applies the latest install flow. Useful flags: --non-interactive (skip prompts, CI-friendly), --reconfigure (re-prompt even if config exists), --skip-pull (don’t git-pull, useful on feature branches), --skip-llm-download (defer the ~2 GB model fetch).

Distribution channels

Homebrew, curl-piped installer, or plain git clone — all three end at the same setup.sh.

Three ways to get lognote on your machine:

Homebrew: brew tap shariqh/tools && brew install lognote. The formula clones to ~/dev/lognote and runs setup.sh. Future updates: brew upgrade lognote.
One-line installer: curl -fsSL https://raw.githubusercontent.com/shariqh/lognote/main/install.sh | bash. Equivalent to git clone + setup.sh. Override the destination with LOGNOTE_INSTALL_DIR=...; pass setup flags via LOGNOTE_INSTALL_FLAGS="--non-interactive".
Git clone: git clone git@github.com:shariqh/lognote.git ~/dev/lognote && cd ~/dev/lognote && ./setup.sh. The most direct path; recommended if you want to track a branch or read the source before installing.

All three converge on setup.sh, so install behavior is identical regardless of which channel you used. While the repo is private, the Homebrew and curl paths need a GitHub token with repo scope (gh auth login once, then re-run) — this caveat disappears when the repo goes public.

Uninstall and reinstall

./setup.sh --uninstall removes integrations with per-step confirmation; never touches your recordings or notes.

./setup.sh --uninstall walks through every integration lognote installed on your machine — PATH wrappers, plugin symlink in the vault, state directory, menu bar app, brew-installed dependencies, the cloned repo itself — and asks for confirmation on each one before removing it. Pass --non-interactive to auto-confirm every prompt (use carefully).

./setup.sh --reinstall is sugar for “uninstall non-interactively, then install” — useful when something’s wedged and you want to start from a clean slate. Crucially, neither flag ever touches your audio recordings or vault notes. Uninstall removes the tooling that produced them; the content itself stays where it is.

Guided walkthrough

The Mac app's setup tour: vault auto-detected or chosen, plugin installed, permissions granted, first note recorded — models prepare while you learn.

The Lognote Mac app (the drag-install .dmg) opens into a guided walkthrough the first time you launch it. Instead of a wall of settings, setup is a short tour:

Vault detection. Lognote reads Obsidian’s vault registry and, when the answer is unambiguous — exactly one vault installed, or exactly one currently open — configures it automatically and moves straight on, without asking. Only when several vaults are present and none is clearly active (or none is found) does it stop to ask: pick from the detected list, or browse to your vault folder. The folder has to be a real Obsidian vault — one containing a .obsidian directory — or it’s rejected.
Install the Obsidian plugin. One step puts the recording controls inside Obsidian; the tour waits until Obsidian confirms the plugin is up, with live hints if anything needs a manual toggle. If Obsidian starts in Restricted Mode (which blocks community plugins by default), the tour surfaces the exact toggle to turn it on — Settings → Community plugins → Turn on community plugins — and waits for you to enable Lognote.
Grant permissions. macOS asks twice — once for the microphone, once for system audio — because lognote captures both sides of your calls. The tour runs a test capture (~8 seconds, plays a chime so the system-audio track has something to hear) and checks both tracks. Missing or denied microphone access and a missing system-audio authorization are hard blocks — those are required for recording to work. If the system-audio track was authorized but captured only silence during the test (sometimes happens with unusual audio routing), the tour shows a warning and lets you continue anyway; your microphone still records, so you can try a real recording and see whether the other side comes through.
Models prepare while you wait. Recordings are transcribed entirely on your Mac, so the first launch downloads and optimizes the on-device models in the background. A “Preparing models…” strip at the bottom of the step shows progress while that runs. Meanwhile the tour shows a sample note — summary, action items, decisions, open questions, and the speaker-labeled transcript — so you know exactly what you’ll get before you’ve recorded a thing. (Summaries can also use a cloud provider you configure; the audio itself stays on your Mac.)
Record your first note. A three-line instruction plus an optional ~20-second read-aloud script, so your first real note lands while the tour is still open to celebrate it. The Back button is on the left; Finish is on the right.
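
The vault-detection rule in the first step can be sketched as a small function. It assumes the registry is a JSON object with a vaults map whose entries carry a path and an optional open flag (which matches Obsidian's obsidian.json as far as we know, but treat the shape as an assumption). The pick_vault name is hypothetical, and applying the .obsidian check to auto-detected vaults as well as browsed folders is our guess.

```python
import json
from pathlib import Path
from typing import Optional


def pick_vault(registry_json: str) -> Optional[Path]:
    """Return the vault to auto-configure, or None when the user must choose."""
    vaults = list(json.loads(registry_json).get("vaults", {}).values())
    if len(vaults) == 1:
        candidates = vaults                               # exactly one vault
    else:
        candidates = [v for v in vaults if v.get("open")]  # exactly one open?
    if len(candidates) != 1:
        return None  # ambiguous or nothing found: ask the user
    path = Path(candidates[0]["path"])
    # A real vault contains a .obsidian directory; anything else is rejected.
    return path if (path / ".obsidian").is_dir() else None
```

A None result corresponds to the tour stopping to ask: pick from the detected list, or browse to a folder.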

Picking up where you left off. If you relaunch (or re-run) the tour after some steps are already done (vault chosen, plugin loaded, models downloaded), it doesn’t silently skip past them. A brief summary screen checkmarks everything that’s already set and points at the one step that still needs you, then lands there. The progress dots along the top turn into checkmarks as each step completes, and a Back button lets you step back through finished steps to review or change them without the tour pulling you forward again.

Re-learn the tour anytime. Lognote menu → Reset Walkthrough restarts it from the top. Your settings, downloaded models, and permission grants are untouched — it only forgets that you’ve seen the tour.

The home window. After the walkthrough, launching the app shows a tabbed settings manager with four tabs: General (setup status at a glance: permissions, model warmup, and plugin install, with a “Re-check permissions” button and “Re-run walkthrough”); Transcription (WhisperKit model and language); Summarization (provider, credentials, “Test connection”, auto-order); and Recordings (audio retention and auto-stop on silence). Changes write to ~/.config/lognote/env, the authoritative config source. The Obsidian plugin no longer carries provider or model settings; configure everything here.

Automatic updates

lognote keeps itself current via signed, over-the-air updates that are cryptographically verified before installing.

lognote checks for updates automatically — once a day, in the background. When a new version is available, it lets you know with a quick prompt; one click downloads the cryptographically verified update and installs it. You don’t need to visit a download page or run a script.

What updates, what stays put

The update replaces the app and refreshes the Obsidian plugin bundled with it (see below). Your notes, recordings, configured providers, API keys, and the on-device models are all untouched. The audio-stays-on-your-Mac promise carries forward unchanged through every update.

Keeping the Obsidian plugin in sync

The Obsidian plugin ships inside the app, so each app build carries a matching plugin. When you launch the app it keeps your vault’s plugin matched to the installed app — normally that means copying in a newer plugin after an update, but it also means restoring the matching plugin if you ever roll the app back, so the plugin and the app version it talks to never drift apart. You don’t reinstall anything. Obsidian keeps running the previously loaded plugin until it’s reloaded, so the next time you’re in Obsidian you’ll see a quick “lognote updated — reload to apply” prompt with a Reload now link. One click swaps in the matching plugin; until then, recording keeps working with the version already loaded.

Verification

Every update is code-signed with a Developer ID certificate and notarized by Apple before it’s published. On top of that, lognote uses Sparkle’s Ed25519 signature scheme: each download is signed with a private key that never leaves the maintainer’s control, and the app verifies the signature cryptographically before installing anything. A tampered or corrupted file is rejected.

Checking manually

You can also check for an update on demand: Lognote menu → Check for Updates. The app will contact the update server, report whether a new version is available, and install it if so.

Background checks

By default, lognote checks for updates once a day in the background. No action is needed for these to run.

Feature tour

An opt-in, dismissible tour of Lognote's features — offered after setup and reachable any time — that surfaces only what's new for returning users.

After the setup walkthrough lands your first note, Lognote offers an optional tour of what it can do. It’s never required: a small, dismissible prompt — “Take a 1-minute tour of what Lognote can do?” — appears alongside the first-note celebration with Take the tour and Maybe later. Finishing setup works exactly the same whether or not you take it. (You can also reach the tour, or skip straight to Obsidian, right from the first-note step — before you record anything.)

The tour is a short, card-per-feature walkthrough — Next, Back, progress dots, and a Done. Each card explains one feature in a sentence or two and, where it helps, a concrete “try it now” nudge.

What the tour covers

The current tour walks through the shipped feature set:

Recording basics — click the mic icon in Obsidian’s left ribbon to start, click it again to stop; a marker drops into your note and the transcript lands there when it’s done. There’s no default keyboard shortcut — bind one yourself under Settings → Hotkeys if you’d like.
Both sides, no bot — Lognote captures your microphone and the call’s audio together and labels them “me” and “others.” Nothing joins the meeting, and the audio is transcribed on your Mac.
Auto-stop on silence — a recording you forget to stop ends itself after 5 minutes of silence on both sides, with a heads-up notification 60 seconds before.
Record, then carry on — the moment you start recording you’re free to switch notes, rename this one, even quit Obsidian; the recorder runs on its own and the transcript follows its marker back into place.
Troubleshooting & recovery — if a summary can’t be generated the transcript still lands and the failure is called out (fix the cause, then re-run the summary), and if a note wandered off mid-transcription you can scan for orphaned markers to re-home its transcript.
Import & reshape transcripts — bring in a transcript you already have (a .vtt/.srt caption file, an Otter export, a meeting tool’s download) and Lognote formats and summarizes it like a recording you made here.

The Import & reshape card appears only when the matching capability is present in your install, and it lists only the parts that are actually available — so it never advertises a feature you don’t have. Reshaping recordings (splitting one long recording into separate notes, or joining recordings that broke across a dropped call) appears on the same card where those capabilities are present. Cards like this light up automatically once their capability ships — there’s no separate update to chase.

Only what’s new — no forced replays

The tour is built so a returning user is never marched back through steps they’ve already seen. Lognote remembers which cards you’ve been through (separately from the setup walkthrough’s own state). If a later update adds a new feature, the next launch surfaces an opt-in “We added something new — take a look?” rather than replaying the whole tour. If you’ve already seen everything, you’re not prompted at all.

Re-open it any time

You can open the tour whenever you like from the Lognote app’s General tab — the Feature tour button is always there. When there are cards you haven’t seen yet (for example, after an update added a feature), a small New badge appears beside it. Declining (“Maybe later”) just stops the prompt from nagging — it doesn’t hide the tour from the General tab.

that's the lot

Missing something you expected to see? Email hello@lognote.dev. The handbook tracks the product, so if it's here it ships, and if it ships it should be here.