A first-person account of DingDing's ONE, an AI work product that went from 0 to 1 to sunset. We read it through one question: what does this teach us about shipping AI on Edge mobile?
Not a press release, not a retrospective deck. A first-person account of why the team made each call, what was loud in the room, and what they only saw later. We're reading it as a structural map of the traps Edge mobile is now walking through.
Boutique products can be opinionated. Platforms can't. They have to carry millions of workflows. ONE picked aggressive, founder-driven defaults and called them "design conviction." The 95% of users who never customize anything just experienced them as the product.
小餐馆可以坚持「我就这样出品」,美团不行。美团要承载千千万万种吃法。《置身钉内》 · Design Chapter
Boutique products can be opinionated ("come if you like our menu"). Platforms must carry millions of workflows. ONE picked aggressive, founder-driven defaults (auto-mark read, system-decided ordering, Discover always-on) and called them "design conviction."
Defaults on the new tab page, omnibox behavior, default-on AI features, notification policy: to the 95% of users who never open Settings, those don't read as "design opinions." They read as the product.
Before any default ships, answer on the record: "If a user only ever sees this as the default and never customizes it, do they thank us or curse us?" If the team can't agree → ship as opt-in.
这个入口很容易误触,以往习惯点这里去消息 tab,被强行影响了使用习惯。《置身钉内》 · User Chapter
ONE placed a large AI entry point in DingDing's bottom-left corner. Internally, the team felt they'd chosen the "milder" option. Externally, users felt one thing: the spot they reached for every day no longer did what it used to.
Any AI surface that replaces or shifts a familiar control (address-bar action, bottom tab, new-tab default) lands as loss of an old habit in week one. The "new feature" framing only registers later, if at all.
Every entry-point change needs an explicit "habit-loss" budget: what was the old action worth in MAU terms, and how long is the rollback window if the new function under-delivers?
Mobile compresses your design vocabulary down to cards, gestures, peeks. ONE's deepest losses came from patching the symptoms instead of revisiting the root. The aesthetic of "card-feel" hid a hard fact: every card surfaced was a responsibility surfaced.
我们很敏捷地修补了很多地方,却始终没有敏捷地回到最初的判断。这就像房梁歪了,却每天换窗帘、擦地板、调灯光。《置身钉内》 · Agility Chapter
ONE auto-marked messages as read when users scrolled past cards. To soften the fear, the team stacked patches: horizontal peek, a "Peekaboo" gesture for pre-read, summaries, undo paths. None asked the real question: should the system auto-mark read at all?
When an AI feature needs preview, undo, confirmation, anxiety-reducers, the root design is probably wrong. Patches make critique invisible; they don't make the design right.
Track a patch ratio per feature: number of post-launch mitigations ÷ shipped value. When it crosses 3, trigger a root-design review. Skip the next patch sprint.
工作产品里的「看见」从来不是中性的。看见一条消息,可能意味着已读;看见一个待办,可能意味着责任。《置身钉内》 · Design Chapter Summary
ONE fell in love with "card-feel" as an aesthetic. But every card surfaced was a responsibility surfaced. Seeing the card changed user state, regardless of whether the user wanted that state changed.
AI summaries, tab group suggestions, recommended actions, Copilot nudges: each one tells the user "you now know about this." Screen real-estate is the easy cost. The cognitive and accountability cost burns trust.
Add a responsibility audit to every new mobile surface: what does the user now feel obligated to do that they didn't a minute ago? If the answer is "nothing useful" → cut the surface.
Three traps eat user research: surveys that reward politeness, beta cohorts pre-loaded with context, and abstract questions that get abstract answers. Build for the silent behavior, not the spoken one.
一个用户说「这个功能不错」,却不愿意多点一次,不愿意改变路径……那么这个「不错」就很轻。《置身钉内》 · User Chapter
"Great," "I'd use it," "valuable direction" all decode to zero. The real signals are costs the user is willing to pay: extra taps, habit changes, workflow migration, exposing private data, recommending to teammates.
Survey scores about Copilot satisfaction are vibes. Did the user reach for it unprompted? Return within 2 weeks? Leave the default search untouched? That's the data.
Separate stated metrics (surveys, NPS, self-report) from behavioral metrics (W2 retention, unprompted invocations, undo rate, switch-away). Review them on different days, by different people, in different decks.
内测玩家会替产品补全意义,正式用户只验收眼前价值。前者体验的是产品加服务,后者体验的仅仅是产品本身。《置身钉内》 · User Chapter · Shedow MMO case
The author cites NetEase's Shedow: 6 years, 600 people, sky-high beta scores from a curated cohort, then dead within 18 months of launch. Beta players got briefings, context, gifts; real players got the bare product and bounced.
Insider, Dogfood, employee channels are pre-loaded with our intent, our docs, our explanations. Their behavior has almost no predictive power for GA. Treat as direction, never as volume.
Always tag insider data. Never use it as a "GA-ready" gate. The real GA signal is the first cohort of strangers, with no onboarding, in their first 60 seconds.
理论上,没有人会反对学习……可当用户正在处理消息时,系统突然把他带进一个学习流,实际感受可能就是「像广告」「占地方」。他并没有撒谎。《置身钉内》 · User Chapter · The Mom Test
Ask "would you want learning content?" → users say yes. Drop a video feed into the workflow → users call it spam. Both honest. The abstract question simply has no relationship to the concrete moment.
Stop asking "do you want AI summaries on mobile?" Ask: "Walk me through the last long article you read on your phone. What did you do? Was Copilot involved? Why or why not? What did you do next?"
Default every interview script to the Mom Test rule: past behavior, specific events, real costs paid. Ban hypotheticals. If a question starts with "would you," rewrite it.
"Agile" can either accelerate learning or accelerate theater. ONE became a case study in the second mode: every day a new build, every week a new ship, and the team's morale ground out by year-end.
它偏爱:今天能看见的,今天能截图的,今天能被老板验收的,今天能写进 changelog 的……不喜欢:需要长期建模的,需要打通底层数据的,一开始看不出效果、但半年后决定上限的。《置身钉内》 · Agility Chapter
A "daily build" cadence becomes the org's accounting system. Personalization, memory, feedback loops, permission and audit: none of these book today, but all of them decide the product's ceiling. They get tagged "important not urgent" and slip forever.
Sprint demos and leadership reviews can quietly retune team priorities from "right thing" to "visible thing." The result is technical debt with a happy changelog.
Carve out a protected 20–30% of sprint capacity for "ceiling work" (profile-based personalization, cross-device memory, preference loops). Separate burndown. Don't trade for demo-friendly tasks.
为了让功能更像 90 分,系统选择了最容易得分的一类关系。题答对了,场景却变小了……把用户的脚修成鞋的形状。《置身钉内》 · Agility Chapter
ONE's "unread-but-unanswered" Agent had to hit a launch-quality bar. To raise precision, the team restricted scope to direct-manager messages. The number went up; user value went down. Real forgotten messages live in customer groups and offhand peer asks, exactly the messy cases the model now ignored.
Any AI feature where precision becomes the ship-gate will quietly shrink its own scope. Watch for "we hit 92% accuracy" hiding "on 4% of real-world cases."
Always ship precision × coverage together. "75% accuracy across 80% of real cases" is almost always more valuable than "92% across 5%." Make coverage first-class in launch readiness.
人不怕累到极点,人怕累了很久,却不知道自己赢在哪里。《置身钉内》 · Agility Chapter
ONE's team shipped daily and burned out anyway. The author's diagnosis: agility had stopped serving learning and started serving proof. The team was busy but unsure where they were winning.
As a PM, one of your invisible jobs is to define and bank victories. Launches and OKR percentages don't count. A victory is a moment where "we understand the user better than last sprint" is provably true.
End every sprint review with a victory check: did we learn something concrete about a real user this sprint? Three sprints with no answer → you're not running agile, you're running on a treadmill.
Each option turns the 10 lessons into a concrete artifact you can put in front of the Edge mobile team this week.
Convert the 10 lessons into 12–15 review questions a team must answer before any AI surface ships. ("Is this a default?", "Was the beta cohort filtered?", "Is the metric narrowing scope?")
A quantifiable "victories vs. treadmill" scorecard for sprint reviews. Turns Lesson 10 into a recurring practice instead of a feeling.
Pick a single lesson (for example, "Edge mobile's default-power map" or "How to separate Insider data from GA signal") and we'll expand it into a working memo.