Marily’s AI Product Academy Newsletter

Prompt Engineering Is Dead. Good.

Here's what actually matters now

Marily Nika
Jun 22, 2026
∙ Paid

For years, the narrative was: the better your prompt, the better your AI.

So everyone optimized prompts. Tried new techniques. Packed more and more text into the context window. Spent cycles on phrasing.

LinkedIn was full of “prompt engineering frameworks.” Twitter was full of techniques. There were courses. Workshops. People billing themselves as “prompt engineers.”

It made intuitive sense: better input = better output.

Turns out, that’s the visible 10% of the problem.

The real work is system design. Architecture. Constraints. Guardrails. Feedback loops. What happens when the prompt fails (it will). How the system recovers. Whether the failure is logged, surfaced, or silently wrong.

A janky prompt inside a well-designed system beats a perfect prompt inside a brittle one every single time.

Here’s What Actually Changed

We stopped thinking “write a better instruction” and started thinking “design a system that works even when the instruction isn’t perfect.”

That’s the actual skill. That’s where PMs win or lose.


I watched a team ship an AI feature with an elegant, carefully crafted prompt. It broke constantly. Users didn’t trust it. They stopped using it.

I watched another team ship a janky prompt—honestly, kind of a mess—that held up under chaos. Users forgot it was AI. It just worked.

Same models. Different approaches.

The difference wasn’t the prompt. It was everything else.

The 10% vs. The 90%

The prompt is the visible 10%. You can see it. You can edit it. You can measure changes to it.

The system is the invisible 90%.

Input validation. Constraint enforcement. Output verification. Error handling. Monitoring. Feedback loops.

This is where the real work lives.

But it’s unglamorous. It’s not something you tweet about. It’s not a technique. It’s architecture.

So teams skip it. They obsess over the prompt instead.

And then they’re surprised when the feature breaks in production.


What “System Design” Actually Means

1. Input Validation

Before the prompt ever runs, ask: is this the right input? Right format? Right length?

If it doesn’t meet the constraint, reject it. Transform it. Ask for clarification.

The prompt never sees bad data.
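Here is a minimal sketch of that gate in Python. Everything in it is an assumption for illustration: the `validate_input` name, the `ValidationResult` shape, and the 4,000-character limit are hypothetical, not taken from any particular product.

```python
from dataclasses import dataclass

MAX_CHARS = 4000  # assumed limit; pick one that fits your feature


@dataclass
class ValidationResult:
    ok: bool
    text: str = ""     # cleaned input, only set when ok is True
    reason: str = ""   # why the input was turned away, only set when ok is False


def validate_input(raw, max_chars: int = MAX_CHARS) -> ValidationResult:
    """Gate that runs before any prompt is built (hypothetical helper)."""
    if not isinstance(raw, str):
        return ValidationResult(ok=False, reason="not_text")
    text = " ".join(raw.split())  # transform: collapse stray whitespace
    if not text:
        # nothing usable: the caller should ask the user for clarification
        return ValidationResult(ok=False, reason="empty")
    if len(text) > max_chars:
        # reject: too long to handle reliably
        return ValidationResult(ok=False, reason="too_long")
    return ValidationResult(ok=True, text=text)
```

The caller only builds a prompt when `ok` is true. The `reason` field is what lets the product respond sensibly (ask again, trim, refuse) instead of passing bad data downstream.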

2. Constraint Architecture

You don’t ask the model to stay within bounds. You enforce the bounds.

Instead of asking: “Please summarize in exactly three bullet points, no more than 50 words each” (the model ignores this some of the time; call it 20%)

You check: confirm the output has exactly three bullets, count the words in each, and reject and retry if either check fails.
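A rough sketch of what that enforcement could look like. The function names, the bullet characters accepted, and the three-attempt cap are all assumptions; `generate` stands in for whatever function calls your model.

```python
import re
from typing import Callable, Optional

# Assumed bullet markers: "-", "*", or "•" followed by a space.
BULLET = re.compile(r"^\s*[-*•]\s+(.*\S)\s*$")


def check_summary(output: str, bullets: int = 3, max_words: int = 50) -> bool:
    """True only if there are exactly `bullets` bullet lines, each within max_words."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    matches = [BULLET.match(ln) for ln in lines]
    if len(lines) != bullets or not all(matches):
        return False
    return all(len(m.group(1).split()) <= max_words for m in matches)


def summarize_with_enforcement(
    generate: Callable[[], str], max_attempts: int = 3
) -> Optional[str]:
    """Call the model until the output passes the check, or give up with None."""
    for _ in range(max_attempts):
        candidate = generate()
        if check_summary(candidate):
            return candidate
    return None  # caller decides what to do next: fallback, escalate, etc.
```

The prompt can still ask for three bullets. The difference is that the product no longer depends on the model listening.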

3. Graceful Failure

The prompt will fail. So what?

Plan for it. Retry logic. Escalation. Fallback. Timeout. How does the system recover?

Decide this in system design. Don’t leave it to chance.
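One way to write that decision down, as a sketch rather than a recipe. `primary` and `fallback` are hypothetical callables (say, the model call and a cheaper canned path), and the sketch assumes a timeout surfaces as an exception from your model client.

```python
import time
from typing import Callable


def call_with_recovery(
    primary: Callable[[], str],
    fallback: Callable[[], str],
    retries: int = 2,
    backoff_s: float = 0.0,
) -> dict:
    """Try primary up to retries + 1 times, then fallback, then escalate."""
    errors = []
    for attempt in range(retries + 1):
        try:
            return {"source": "primary", "text": primary(), "errors": errors}
        except Exception as exc:  # assumed to include client timeouts
            errors.append(f"primary attempt {attempt + 1}: {exc!r}")
            if attempt < retries:
                time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    try:
        return {"source": "fallback", "text": fallback(), "errors": errors}
    except Exception as exc:
        errors.append(f"fallback: {exc!r}")
    # Nothing worked: hand off to a human or show an honest error state.
    return {"source": "escalate", "text": None, "errors": errors}
```

Note that the errors are returned, not swallowed. Recovery and logging are designed together, so a failure is never silent.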

4. Monitoring

Say your prompt works 95% of the time. That means 5% of outputs are wrong, and without monitoring you never see them.

So the system logs every output. Flags suspicious patterns. Alerts when confidence drops. You measure what’s actually happening.
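A small sketch of the idea, using the pass/fail result of output verification as the signal (a stand-in for whatever confidence measure you actually have). The `OutputMonitor` class, the logger name, and the thresholds are hypothetical.

```python
import logging
from collections import deque

logger = logging.getLogger("ai_feature")  # hypothetical logger name


class OutputMonitor:
    """Tracks verification results over a rolling window of recent calls."""

    def __init__(self, window: int = 100, alert_rate: float = 0.05, min_samples: int = 20):
        self.results = deque(maxlen=window)  # oldest results fall off automatically
        self.alert_rate = alert_rate         # assumed threshold: alert above 5% failures
        self.min_samples = min_samples       # avoid alerting on a handful of calls

    def record(self, passed: bool, detail: str = "") -> None:
        self.results.append(passed)
        if not passed:
            logger.warning("output failed verification: %s", detail)

    @property
    def failure_rate(self) -> float:
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_alert(self) -> bool:
        enough_data = len(self.results) >= self.min_samples
        return enough_data and self.failure_rate > self.alert_rate
```

The rolling window matters: a feature that was fine last month and is failing today should alert on today's rate, not on its lifetime average.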

5. Learning

Every real-world failure teaches you something. The system captures it. Adjusts. Improves.

The prompt stays the same. The system gets smarter.
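“Captures it” can start very small. The sketch below is one assumed shape for it: a `FailureLog` (hypothetical name) that stores each failure, shows which failure reasons dominate, and turns the failing inputs into a regression set you rerun before every release.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FailureLog:
    """In-memory store of real-world failures (a real system would persist these)."""

    cases: list = field(default_factory=list)

    def record(self, user_input: str, output: str, reason: str) -> None:
        self.cases.append({"input": user_input, "output": output, "reason": reason})

    def top_reasons(self, n: int = 3) -> list:
        """Most common failure reasons, so you know which check or fallback to build next."""
        return Counter(c["reason"] for c in self.cases).most_common(n)

    def regression_inputs(self) -> list:
        """De-duplicated failing inputs, in the order they were first seen."""
        return list(dict.fromkeys(c["input"] for c in self.cases))
```

Nothing here touches the prompt. What improves is the set of checks, fallbacks, and test cases around it.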


Why This Matters for You

If you’re building an AI feature, you can optimize prompts forever and ship something fragile.

Or you can get the system design right and ship something boring that works.

Boring wins.

Users don’t care how clever your prompt is. They care that it works every time. That failures are handled. That the whole thing is predictable.

The teams shipping robust AI features aren’t the ones with the best prompt writers. They’re the ones thinking like systems engineers.


The Shift

  • 2023: “How do I write better prompts?”

  • 2024: “How do I build systems that work even with imperfect prompts?”

  • 2025: “How do I design systems that get smarter from real-world failures?”

The best teams are already asking the 2025 question.

If you’re still in 2023, you’re behind.


What You’re Actually Building

Think of your AI feature as layers:

Top layer (visible): The prompt.

Middle layers (invisible): Validation. Constraints. Verification. Error handling. Monitoring.

Bottom layer (critical): Feedback loops. Learning. Iteration.

Most teams obsess over the top layer. That’s backwards.

The prompt is important. But it’s not the bottleneck. The system is.


One Question

Before you ship an AI feature, ask yourself:

“If the prompt breaks tomorrow, does this feature still work?”

If the answer is yes, you’ve got system design.

If the answer is no, you’ve got a fragile feature waiting to break.

Fix it now, not in production.
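You can turn that question into a test by breaking the model on purpose. A minimal sketch, where `answer`, `SAFE_DEFAULT`, and the fake models are all hypothetical stand-ins for your own feature:

```python
SAFE_DEFAULT = "Sorry, I can't answer that right now. A teammate will follow up."


def answer(question: str, model) -> str:
    """Hypothetical feature: model output if usable, otherwise a safe default."""
    try:
        text = model(question) or ""
    except Exception:
        text = ""  # the model call broke; fall through to the default
    return text.strip() or SAFE_DEFAULT


def broken_model(question: str) -> str:
    # Fault injection: simulates the prompt or model failing outright.
    raise RuntimeError("model unavailable")


# The feature still responds when the model breaks or returns nothing useful.
assert answer("Where is my order?", broken_model) == SAFE_DEFAULT
assert answer("Where is my order?", lambda q: "   ") == SAFE_DEFAULT
```

If a test like this is hard to write for your feature, that difficulty is the answer to the question.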


This is what we teach in my award-winning AI Product Certification (summer cohorts are $1k off): seeing the invisible layers before they break, and designing systems that don’t depend on perfect prompts.

Worth a look if you’re shipping AI features.


What To Monitor First

Don’t over-engineer this. Start with the three things that matter most.

1. Confidence vs. Accuracy Divergence
