Runtime Rescue: when the infra guesses for the model

What Runtime Rescue tried to solve

In the early versions of Hive Agent, a user would type something like "build me a jewelry shop site", and we wanted that to work even if the model forgot to emit the build_and_deploy tool call. So we wrote a function in routes/client.js that did a second pass on the message: regex on keywords, length scoring, checking for brand names, and if the score crossed a threshold we enqueued a build ourselves.

The code looked reasonable at the time:

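// Second pass over the raw user message, run when the model emitted no tool call.
function maybeRescueBuild(message, ctx) {
  // Heuristic score: keyword regex, message length, brand-name check.
  const score = scoreBuildIntent(message);
  // Note: ctx.recentBuildEnqueued is the only guard; an existing project is never checked.
  if (score >= 0.7 && !ctx.recentBuildEnqueued) {
    return enqueueBuild({
      userId: ctx.userId,
      prompt: message,
      source: 'runtime_rescue'
    });
  }
  return null;
}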

It did rescue a handful of turns where the model genuinely forgot to call the tool. The problem was that it also rescued turns where the user never asked to be rescued. Any mention of "site" or "build" could trip it. This is a heuristic trying to extract intent from free text, and it always ends the same way.
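
To make the failure concrete, here is a minimal sketch of what a keyword scorer of this kind might look like. The original scoreBuildIntent is not shown in this post, so the keyword list, the weights, and the length bonus below are all assumptions, and the brand-name check is left out. The point is only that a real build request and a passing mention of a site can both clear the 0.7 threshold.

```javascript
// Hypothetical reconstruction: keyword list and weights are invented for illustration.
const BUILD_KEYWORDS = /\b(build|built|site|website|shop|store)\b/gi;

function scoreBuildIntent(message) {
  // Count distinct keyword hits, case-insensitively.
  const matches = message.match(BUILD_KEYWORDS) || [];
  const hits = new Set(matches.map((word) => word.toLowerCase()));

  // Small bonus for longer messages that contain at least one keyword.
  const wordCount = message.trim().split(/\s+/).length;
  const lengthBonus = hits.size > 0 && wordCount >= 4 ? 0.1 : 0;

  return Math.min(1, hits.size * 0.35 + lengthBonus);
}

const THRESHOLD = 0.7;
const messages = [
  'build me a jewelry shop site',            // a real build request
  "let's go back to the main site we built", // not a build request
  'thanks, that looks great',                // no keywords at all
];
for (const message of messages) {
  const wouldRescue = scoreBuildIntent(message) >= THRESHOLD;
  console.log(JSON.stringify({ message, wouldRescue }));
}
```

Under these assumed weights the first two messages both trip the rescue, even though only the first one asks for a build. Tuning the weights moves the boundary but does not give the scorer access to the user's intent.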

How it spawned ghost projects

The case that broke our patience: a user opened a chat tied to an existing project, worked on it for ten minutes, then typed "let's go back to the main site we built". They meant: stay in this project's context and return to the home page. Rescue matched "built" and "site" and scored the message at 0.78. A new build was enqueued, a fresh project_id was minted, a fresh container was provisioned, and the user got back an unfamiliar site they never asked for.

In the logs it looked like this:

{
  "event": "build_enqueued",
  "source": "runtime_rescue",
  "score": 0.78,
  "existing_project_id": "prj_8821",
  "new_project_id": "prj_8847",
  "prompt": "let's go back to the main site we built"
}

The heuristic had no idea what existing_project_id meant. It only saw text. We were getting 30-50 of these a day: small as a percentage, but enough to pollute the DB and let the session's project_id swap itself underneath the user at unpleasant moments. A ghost project is not just an extra row; it is a live container, occupied disk, and a confusing routing target for the next message.
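
Ghost builds are easy to count once you know the shape of the log event. The sketch below assumes the events are already parsed into objects with the fields shown in the log above; the helper names and the second sample event (including its source value 'tool_call') are made up for illustration.

```javascript
// A ghost build: rescue enqueued a build that minted a new project
// while the session already had a different project attached.
function isGhostBuild(event) {
  return (
    event.event === 'build_enqueued' &&
    event.source === 'runtime_rescue' &&
    Boolean(event.existing_project_id) &&
    event.new_project_id !== event.existing_project_id
  );
}

function countGhostBuilds(events) {
  return events.filter(isGhostBuild).length;
}

const sampleEvents = [
  {
    event: 'build_enqueued',
    source: 'runtime_rescue',
    score: 0.78,
    existing_project_id: 'prj_8821',
    new_project_id: 'prj_8847',
    prompt: "let's go back to the main site we built",
  },
  {
    // Invented counter-example: a rebuild of the same project via a tool call.
    event: 'build_enqueued',
    source: 'tool_call',
    existing_project_id: 'prj_8821',
    new_project_id: 'prj_8821',
    prompt: 'rebuild the home page',
  },
];

console.log(countGhostBuilds(sampleEvents));
```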

Why intent heuristics always break

Intent classification from free text is an open problem. Every rule you add to cover a new case breaks two others. We saw three stable failure patterns:

Same words, opposite intent: "let's build this together" (positive) vs. "we already built it, no need again" (negative). Same vocabulary, similar score, opposite decision.
Context that lives outside the string: the heuristic sees one message. The user has a 30-message arc. A handful of keywords cannot compete with real context.
Diffuse blame: when the heuristic misfires, it is unclear whether the model or your code is at fault. Debugging becomes a two-person investigation.

The natural engineering reflex is to add another condition: "If there's a project_id in the session, skip." "If the message contains the word 'existing', skip." Each patch is a point fix, and the patches never converge. The right question is not "how do I improve the heuristic?" but "why am I guessing at all?" If the model knows when to build, it has a tool for that. If it doesn't, your code claiming to know is not a substitute.

The replacement: project_id locked in session + explicit tool

We moved the decision back to the only place that can hold it: the session, plus an explicit tool call from the model. Two simple rules:

project_id is locked: the moment a user opens a chat tied to an existing project, the session pins the id. No message, no score, no phrasing replaces it. Only an explicit transition can.
build_and_deploy with confirm_new_project=true: this is the only path that mints a new project_id. If the model makes a call that would create a new project without setting the flag, the server rejects it with a clear error: "this would create a new project, explicit confirmation required".

// routes/client.js — after
app.post('/chat', async (req, res) => {
  const session = await loadSession(req.userId);
  // No scanning of req.body.message for build intent.
  const reply = await runAgentTurn({
    session,
    message: req.body.message,
    tools: getToolsFor(session),
  });
  return res.json(reply);
});

// build_and_deploy schema
const buildAndDeploy = {
  name: 'build_and_deploy',
  description: 'Build and deploy the active project. Set confirm_new_project=true ONLY when the user explicitly asked to start a new project.',
  input_schema: {
    type: 'object',
    properties: {
      project_id: { type: 'string' },
      confirm_new_project: { type: 'boolean' }
    },
    required: ['project_id']
  }
};

The tool description is not written for the user. It is written for the model. "ONLY when the user explicitly asked" is a contract. After the change, builds with source="runtime_rescue" went to 0. New ghost projects also went to 0.
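
The server-side half of that contract is not shown above, so here is a minimal sketch of how the two rules could be enforced when the tool call arrives. The handler name, the session.projectId field, the result shape, and the injected mintProjectId helper are all assumptions; only the flag name and the error text come from the rules above.

```javascript
// Hypothetical handler for a build_and_deploy tool call.
// deps.mintProjectId is an assumed helper that returns a fresh project id.
function handleBuildAndDeploy(session, input, deps) {
  if (input.confirm_new_project === true) {
    // The only path that is allowed to mint a new project_id.
    const projectId = deps.mintProjectId();
    session.projectId = projectId;
    return { ok: true, projectId, created: true };
  }

  // Without the flag, the call must target the project pinned in the session.
  if (!session.projectId || input.project_id !== session.projectId) {
    return {
      ok: false,
      error: 'this would create a new project, explicit confirmation required',
    };
  }
  return { ok: true, projectId: session.projectId, created: false };
}

// Usage: a session pinned to prj_8821.
const session = { projectId: 'prj_8821' };
const deps = { mintProjectId: () => 'prj_new_example' };

console.log(handleBuildAndDeploy(session, { project_id: 'prj_8821' }, deps));
console.log(handleBuildAndDeploy(session, { project_id: 'prj_9999' }, deps));
```

The error is returned to the model as a tool result rather than thrown, so the model can ask the user for confirmation and retry with the flag set.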

Removing rescue without breaking live users

Removing code that catches model mistakes is risky if you don't check the second-order effects. Our transition took two weeks. First we added a log-only mode: rescue ran but only recorded would_have_rescued=true. We measured how often it would have fired, and how many of those would have been ghosts.
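
A log-only version of the rescue can be as small as the sketch below. It keeps the same threshold and guard as maybeRescueBuild but never enqueues anything. The function name, the injected scorer and logger, the ctx.projectId field, and every log field other than would_have_rescued are assumptions.

```javascript
// Shadow mode: evaluate the rescue decision, record it, take no action.
function shadowRescueBuild(message, ctx, { scoreBuildIntent, log }) {
  const score = scoreBuildIntent(message);
  const wouldHaveRescued = score >= 0.7 && !ctx.recentBuildEnqueued;

  log({
    event: 'rescue_shadow',
    would_have_rescued: wouldHaveRescued,
    score,
    // Recorded so ghost candidates can be counted later.
    existing_project_id: ctx.projectId ?? null,
  });

  return null; // never enqueues a build
}

// Usage with a stub scorer and an in-memory log.
const shadowLog = [];
shadowRescueBuild(
  "let's go back to the main site we built",
  { userId: 'u_1', projectId: 'prj_8821', recentBuildEnqueued: false },
  { scoreBuildIntent: () => 0.78, log: (entry) => shadowLog.push(entry) }
);
console.log(shadowLog[0]);
```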

The result: in 72% of cases where rescue would have triggered, there was already a project_id in the session — meaning it would have spawned a ghost. In 19%, the model called the tool on its own one turn later. In only 9% was there a real fall-through where the rescue would have actually helped. The damage outweighed the saves by a wide margin.
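
The breakdown above is a three-way bucketing of the shadow log. A sketch of that bucketing follows; the model_called_tool_next_turn field and the precedence order (ghost first, then self-corrected, then real save) are assumptions, and the sample entries are invented purely to exercise the function, not taken from our data.

```javascript
// Classify each would-have-rescued entry into exactly one bucket.
function classifyShadowEntry(entry) {
  if (entry.existing_project_id) return 'ghost';
  if (entry.model_called_tool_next_turn) return 'selfCorrected';
  return 'realSave';
}

// Return the share of each bucket as a rounded percentage.
function summarizeShadowLog(entries) {
  const fired = entries.filter((entry) => entry.would_have_rescued);
  const counts = { ghost: 0, selfCorrected: 0, realSave: 0 };
  for (const entry of fired) {
    counts[classifyShadowEntry(entry)] += 1;
  }
  const total = fired.length;
  const percent = (n) => (total === 0 ? 0 : Math.round((100 * n) / total));
  return {
    total,
    ghostPct: percent(counts.ghost),
    selfCorrectedPct: percent(counts.selfCorrected),
    realSavePct: percent(counts.realSave),
  };
}

// Invented sample, for illustration only.
const shadowEntries = [
  { would_have_rescued: true, existing_project_id: 'prj_1', model_called_tool_next_turn: false },
  { would_have_rescued: true, existing_project_id: 'prj_2', model_called_tool_next_turn: true },
  { would_have_rescued: true, existing_project_id: null, model_called_tool_next_turn: true },
  { would_have_rescued: true, existing_project_id: null, model_called_tool_next_turn: false },
  { would_have_rescued: false, existing_project_id: null, model_called_tool_next_turn: false },
];
console.log(summarizeShadowLog(shadowEntries));
```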

So we disabled the action, kept the log, and after a week of zero new complaints we deleted the code outright. Our checklist for any "remove the safety net" change:

Shadow mode: the code runs but takes no action. It records what it would do.
Quantify the saves: how often did it actually save a real situation? How often did it cause harm?
Verify the proper path in parallel: confirm that build_and_deploy, with a well-written schema, is strong enough that the model calls it on its own.
Delete, don't comment: no // TODO: maybe re-enable. Dead code keeps dependencies alive and ambushes the next reader.

When a safety net is actually fine

We are not against every heuristic. The line between a legitimate safety net and a guessed intent is very sharp for us. A legitimate safety net satisfies three conditions:

Clear damage asymmetry: doing nothing is much worse than a false positive. Example: a timeout that ends a stuck turn — a frozen chat hurts more than an unnecessary retry.
No external state change: if the net only sends a message to the user or asks the model to retry, it's fine. If it inserts a DB row, allocates resources, or sends an email, it's not.
One observable metric: there is a single number that goes wrong the moment the net starts causing harm.
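
The stuck-turn timeout mentioned in the first condition is a useful contrast with the rescue code. A minimal sketch, assuming the agent turn is a promise: the wrapper name, the result shape, and the fallback text are all hypothetical. It only decides what to tell the user; it writes nothing to the DB and allocates nothing.

```javascript
// Race the agent turn against a timer. On timeout, return a fallback reply.
// No external state is created either way.
function withTurnTimeout(turnPromise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => {
      resolve({ timedOut: true, reply: 'That took too long, please try again.' });
    }, ms);
  });

  return Promise.race([
    turnPromise.then((reply) => ({ timedOut: false, reply })),
    timeout,
  ]).finally(() => clearTimeout(timer)); // do not leave the timer pending
}

// Usage: a turn that never finishes is cut off after 50 ms.
withTurnTimeout(new Promise(() => {}), 50).then((result) => {
  console.log(result.timedOut);
});
```

The single metric to watch for a net like this would be the timeout rate per turn: if it climbs, the net is masking a real problem.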

Warning: our strongest rule is that no heuristic ever creates a project_id. Only an explicit tool call with the flag does. Anything that mints new external state goes through the tool protocol.

If you catch yourself writing scoreBuildIntent or guessUserWants, stop. Write a better tool. Sharpen its description. Teach the model when to call it. Do not try to read minds.