The App That Wasn't

Last night my wife and I had a date night with Claude. (Yes, that’s a thing now. I wrote about it yesterday.) We were designing a family assistant — an OpenClaw agent that could handle scheduling, reminders, party logistics, all of it. Great session. Tons of ideas. Afterward I wanted to capture the highlights, so I fired off a message to Obi using the /idea skill for the first time.

Here’s what I sent:

date night with Claude was great. We planned out our Family Assistant OpenClaw design, we talked through the challenges of accessing data and having it structured enough to be usable. We talked through challenges with privacy for both personal data and enterprise data. Also the limitations of current channels for OpenClaw, anyone have a mod for WeChat? I think we’re going to build a lot of cool things this year. The party is at KidsQuest next weekend so bring a change of clothes in case the kids get wet. Send a reminder 5 days before, buy a gift — it looks like they are into Paw Patrol.

I was thinking like someone filling out a form. Dump the whole thing in, let the skill save it as a text blob. That’s what an app would do. Capture the string, store it, done.

The party details at the end? That was me giving an example of what the family assistant should handle — the kind of thing you’d want to say to it naturally. “The party is next weekend, remind me to buy a gift.” I was describing the product we’d just designed, not giving Obi instructions.

Obi didn’t see it that way.

He parsed my message as two things: an idea and a set of action items. He saved the brainstorm about the Family Assistant design. Then he treated the party details as real tasks — set a reminder for March 2nd, noted the change of clothes, flagged the Paw Patrol gift. I didn’t ask for that. I was brainstorming about a future product, and Obi decided the example was actually a request.
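I have no idea what Obi's /idea skill actually does under the hood, but the shape of the split is easy to sketch. Everything below is my own illustration; the ParsedMessage and ActionItem names are invented, not Obi's:

```python
# A hypothetical sketch of the two-way parse Obi made. These names are
# mine, invented for illustration; Obi's internals may look nothing like this.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class ActionItem:
    description: str
    due: str | None = None  # e.g. "5 days before the party"

@dataclass
class ParsedMessage:
    idea: str                                  # the brainstorm worth saving
    actions: list[ActionItem] = field(default_factory=list)

# What my date-night message plausibly became after parsing:
parsed = ParsedMessage(
    idea="Family Assistant OpenClaw design: structured data access, "
         "privacy for personal and enterprise data, channel limitations",
    actions=[
        ActionItem("Remind about the KidsQuest party", due="5 days before"),
        ActionItem("Pack a change of clothes for the kids"),
        ActionItem("Buy a Paw Patrol gift"),
    ],
)
```

The interesting part isn't the dataclass. It's that nothing in my message marked where the idea ended and the tasks began; Obi drew that line himself.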

The funny thing is — he wasn’t wrong. We really do have a party at KidsQuest next weekend. The reminder is actually useful. But I didn’t mean it as an instruction. I was thinking in old-world mode, where the context of “this is an example” would be obvious because a form can’t act on it anyway. Obi doesn’t have that limitation. He read the intent behind the words and acted — even when the intent he found wasn’t the one I had.

This is the shift that’s easy to miss because it looks small. It’s not a feature announcement. It’s not a new model capability. It’s a moment where the interface disappeared and the meaning came through — just not exactly the meaning I intended.

The same thing happened earlier that night when I was testing the /add task routine. I sent two /add commands in a single message. Parsed perfectly — two separate tasks created. Then I marked one /done with a misspelled, shortened version of the task title. Found the right task anyway. Marked it complete.
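If you wanted even a sliver of that forgiveness in plain code, the standard library gets you part of the way. Here's a minimal sketch of the fuzzy lookup, assuming nothing about how Obi actually matches titles (find_task is my own name):

```python
# My own sketch of a forgiving task lookup; not how Obi does it.
from __future__ import annotations
import difflib

def find_task(query: str, tasks: list[str]) -> str | None:
    """Return the stored task title that best matches a sloppy query."""
    # cutoff=0.4 is deliberately loose: better to surface a near miss
    # and confirm it than to fail the way an exact-match app would.
    lowered = [t.lower() for t in tasks]
    matches = difflib.get_close_matches(query.lower(), lowered, n=1, cutoff=0.4)
    if not matches:
        return None
    return tasks[lowered.index(matches[0])]

tasks = ["Buy a Paw Patrol gift for the KidsQuest party",
         "Pack a change of clothes for the kids"]
print(find_task("paw patol gift", tasks))
# -> Buy a Paw Patrol gift for the KidsQuest party
```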

Every one of those interactions would have failed in an app. Misspelled input? No match. Two commands in one message? Error. A paragraph that’s half brainstorm recap and half product example? Pick a field.

Here’s the thing though — this is a double-edged sword. The same flexibility that makes natural language powerful makes it unpredictable. When the system infers intent, sometimes it infers an intent you didn’t have. The upside is enormous, but the blast radius of a misparse is real. An app that does exactly what you tell it is annoying but safe. An agent that does what it thinks you meant is powerful but needs guardrails.

That’s why the pattern I keep coming back to is act-then-verify. Let the agent parse, infer, and execute — but build the review step into the workflow. Not as a speed bump. As a feature. The surgeon doesn’t skip the checklist because they’re confident. The checklist is what makes the confidence useful.
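In code, the pattern is small: stage every inferred action and let nothing fire until a human has seen it. A minimal sketch, with ProposedAction and act_then_verify as hypothetical names rather than any real framework's API:

```python
# ProposedAction and act_then_verify are hypothetical names for the
# pattern, not any real agent framework's API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProposedAction:
    summary: str                  # human-readable: "Remind on March 2nd: buy gift"
    execute: Callable[[], None]   # the actual side effect, held until approved

def act_then_verify(proposals: list[ProposedAction]) -> None:
    """Stage every inferred action; run only what the user approves."""
    for p in proposals:
        answer = input(f"Planned: {p.summary}. Run it? [y/n] ")
        if answer.strip().lower() == "y":
            p.execute()
        else:
            print(f"Skipped: {p.summary}")

# Example: the party details from my message, staged instead of silently executed.
act_then_verify([
    ProposedAction("Set reminder for March 2nd: KidsQuest party",
                   lambda: print("reminder set")),
    ProposedAction("Add task: buy a Paw Patrol gift",
                   lambda: print("task added")),
])
```

The review costs one glance per action. The misparse it catches could cost a wrong reminder, an unwanted purchase, or worse.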

We’re not building apps anymore. We’re building systems that understand. The design challenge isn’t the UI. It’s the trust model.