{
  "newsletter_slug": "frontier-labs",
  "section": "roll",
  "slug": "202603100417_frontier_labs",
  "title": "Frontier Labs",
  "summary": "Tue Mar 3 to Tue Mar 10, 2026 (inclusive), ~1,550 words. Executive synthesis: Across the cycle, the frontier labs split into two visible “centers of gravity”: (1) agentic enterprise execution (OpenAI shipping GPT‑5.4’s native computer-use + tool-search, then immediately...",
  "published_at": "2026-03-10T04:17:00.000Z",
  "page_html": "<h2>Tue Mar 3 to Tue Mar 10, 2026 (inclusive)</h2>\n<p>~1,550 words</p>\n<h2>Executive synthesis</h2>\n<p>Across the cycle, the frontier labs split into two visible “centers of gravity”: (1) <strong>agentic enterprise execution</strong> (OpenAI shipping GPT‑5.4’s native computer-use + tool-search, then immediately packaging it into Excel and an AppSec agent) and (2) <strong>cost-optimized scale + multimodal pretraining</strong> (Google pushing a new low-price Gemini tier; Meta/FAIR publishing from-scratch multimodal scaling results). Overlaid on both is a sharpened <strong>state/procurement constraint layer</strong>: Anthropic’s dispute with the US government escalated into litigation and widespread agency offboarding, while OpenAI’s defense engagement triggered a high-salience senior resignation—turning “safety posture” from abstract governance into near-term distribution and talent outcomes. (<a href=\"https://openai.com/index/introducing-gpt-5-4/?utm_source=openai\">openai.com</a>)</p>\n<h2>Information (The Core)</h2>\n<h3>Theme 1 — Agents move from “demo” to “operational surface area” (computer use, tool ecosystems, and workflow closure)</h3>\n<ul>\n<li><p><strong>OpenAI</strong></p>\n<ul>\n<li><strong>GPT‑5.4</strong> launched <strong>Mar 5</strong> across ChatGPT (as “GPT‑5.4 Thinking”), API, and Codex; OpenAI positions it explicitly as a <strong>professional-work</strong> frontier model combining reasoning + coding + agentic tool use. (<a href=\"https://openai.com/index/introducing-gpt-5-4/?utm_source=openai\">openai.com</a>)  </li>\n<li>The material capability shift is <strong>native computer use</strong> (desktop/app control via screenshots + mouse/keyboard actions) plus <strong>1M-token context</strong> support (notably framed as enabling longer-horizon agents). 
(<a href=\"https://openai.com/index/introducing-gpt-5-4/?utm_source=openai\">openai.com</a>)  </li>\n<li><strong>Tool search</strong> is introduced to avoid front-loading large tool definitions into every prompt—explicitly targeting “large ecosystems of tools/connectors” and MCP-style tool catalogs; OpenAI reports a <strong>47% token reduction</strong> on a Scale MCP Atlas benchmark with tool search vs. exposing all tools directly. (<a href=\"https://openai.com/bn-BD/index/introducing-gpt-5-4/?utm_source=openai\">openai.com</a>)  </li>\n<li><strong>Codex Security</strong> (research preview) shipped <strong>Mar 6</strong> as an “application security agent,” emphasizing deep repo context + automated validation to reduce false positives; rollout targets <strong>Pro/Enterprise/Business/Edu</strong> with <strong>free usage for the next month</strong> (time-boxed adoption push). (<a href=\"https://openai.com/index/codex-security-now-in-research-preview/?utm_source=openai\">openai.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>Google DeepMind / Google</strong></p>\n<ul>\n<li><strong>Gemini 3.1 Flash‑Lite</strong> announced <strong>Mar 3</strong> as a preview tier for “highest-volume workloads,” stressing latency + price/performance rather than peak capability; rolled out via <strong>Gemini API (AI Studio)</strong> and <strong>Vertex AI</strong>. 
(<a href=\"https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/?utm_source=openai\">blog.google</a>)  </li>\n<li>The positioning implies a strategic bet that <strong>agentic volume economics</strong> (serving many tool calls / classifications / translations) becomes a primary competitive axis, not just “best model.” (<a href=\"https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/?utm_source=openai\">blog.google</a>)</li>\n</ul>\n</li>\n<li><p><strong>Anthropic (via partners, not first-party product posts in this scan)</strong></p>\n<ul>\n<li>Japan-channel partner releases repeatedly highlight “<strong>Claude Code</strong>” and a desktop agent “<strong>Claude Cowork</strong>” as a <strong>research preview</strong> being evaluated inside enterprises (notably NRI). This reads as a parallel “agents on the desktop” track—surfacing through integrators/resellers rather than a marquee global launch in this window. (<a href=\"https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai\">nri.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>Meta AI (FAIR)</strong></p>\n<ul>\n<li>FAIR+NYU’s <strong>“Beyond Language Modeling”</strong> (submitted <strong>Mar 3</strong>) is research-level reinforcement for agentic multimodal direction: action-conditioned video + world-modeling behaviors are treated as emergent properties of broad multimodal pretraining (vs. narrow robotics-only datasets). 
(<a href=\"https://arxiv.org/abs/2603.03276\">arxiv.org</a>)</li>\n</ul>\n</li>\n</ul>\n<h3>Theme 2 — Enterprise distribution “hardens”: integrations, resellers, and packaging become the battleground</h3>\n<ul>\n<li><p><strong>OpenAI</strong></p>\n<ul>\n<li><strong>ChatGPT for Excel (beta)</strong> announced <strong>Mar 5</strong>: an Excel add-in embedding ChatGPT directly into spreadsheets, explicitly “powered by GPT‑5.4.” (<a href=\"https://openai.com/index/chatgpt-for-excel/?utm_source=openai\">openai.com</a>)  </li>\n<li>OpenAI also added <strong>financial data integrations inside ChatGPT</strong> (FactSet, Dow Jones Factiva, LSEG, Daloopa, S&amp;P Global, etc.), signaling a strategy of winning regulated/high-stakes workflows by attaching to <strong>trusted data rails</strong> rather than relying on model output alone. (<a href=\"https://openai.com/index/chatgpt-for-excel/?utm_source=openai\">openai.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>Anthropic</strong></p>\n<ul>\n<li><strong>Japan enterprise channel expansion</strong> shows a reseller/integrator scale strategy:<ul>\n<li><strong>Classmethod</strong> (Mar 2) announced an authorized reseller agreement (Amazon Bedrock channel), bundling licensing + consulting/implementation. (<a href=\"https://classmethod.jp/english/news/260302-anthropic/?utm_source=openai\">classmethod.jp</a>)  </li>\n<li><strong>NHN Techorus</strong> (Mar 5) similarly announced Anthropic reseller status for Claude via Bedrock, emphasizing enterprise deployment support. 
(<a href=\"https://en.sedaily.com/technology/2026/03/05/nhn-techorus-becomes-official-anthropic-reseller-in-japan?utm_source=openai\">en.sedaily.com</a>)  </li>\n<li><strong>Nomura Research Institute (NRI)</strong> (Mar 6 release) expanded its partnership with Anthropic Japan, describing (i) implementation support services for Japanese enterprises and (ii) internal Claude for Enterprise deployment to build workforce capability; NRI explicitly calls out support extending to <strong>Claude Code</strong> and evaluation of <strong>Claude Cowork</strong> research preview. (<a href=\"https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai\">nri.com</a>)</li>\n</ul>\n</li>\n<li>Nuance: the cluster of partner-led announcements suggests Anthropic is pushing <strong>regional enterprise penetration</strong> (language/security/regulatory localization) even as its US federal footprint is under acute pressure (see Theme 4). (<a href=\"https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai\">nri.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>xAI</strong></p>\n<ul>\n<li>xAI’s interim CRO <strong>Jon Shulkin</strong> publicly solicited interest in a <strong>free version of Grok Enterprise</strong> (targeting firms ≥50 employees). This is a classic distribution lever—using freemium to seed deployment footprints and create expansion paths. 
(<a href=\"https://www.twstalker.com/xiankun_xu\">twstalker.com</a>)</li>\n</ul>\n</li>\n</ul>\n<h3>Theme 3 — Safety &amp; evaluation gets more “instrumented,” but is increasingly entangled with capability shipping</h3>\n<ul>\n<li><p><strong>OpenAI</strong></p>\n<ul>\n<li>OpenAI published a safety research post <strong>Mar 5</strong> arguing current reasoning models show <strong>low chain-of-thought (CoT) controllability</strong> (i.e., they struggle to obey instructions that would deliberately reshape/obfuscate their reasoning), positioning this as supportive of <strong>CoT monitoring</strong> as a safeguard; they released <strong>CoT-Control</strong>, an open-source eval suite (~13k tasks). (<a href=\"https://openai.com/index/reasoning-models-chain-of-thought-controllability/?utm_source=openai\">openai.com</a>)  </li>\n<li>The <strong>GPT‑5.4 Thinking system card</strong> (Deployment Safety Hub) is explicitly referenced as the place where CoT monitorability/controllability and “Preparedness Framework” assessments are reported for this release line. (<a href=\"https://deploymentsafety.openai.com/gpt-5-4-thinking?utm_source=openai\">deploymentsafety.openai.com</a>)  </li>\n<li>OpenAI also states GPT‑5.4 is treated as <strong>“High cyber capability”</strong> under its Preparedness Framework and “deployed with corresponding protections.” (<a href=\"https://openai.com/index/introducing-gpt-5-4/\">openai.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>Google DeepMind</strong></p>\n<ul>\n<li>Gemini 3.1 Flash‑Lite’s <strong>model card</strong> is unusually explicit about evaluation categories (agentic tool use, long-context, factuality) and includes a “Frontier Safety Assessment” rationale by reference to Gemini 3.1 Pro assessments (i.e., downstream tiers inherit the frontier risk posture from the most capable family member). 
(<a href=\"https://deepmind.google/models/model-cards/gemini-3-1-flash-lite/?utm_source=openai\">deepmind.google</a>)</li>\n</ul>\n</li>\n<li><p><strong>Anthropic</strong></p>\n<ul>\n<li>The most material “safety signal” in-window is not a model card but the <strong>legal/procurement confrontation</strong>: Reuters reports Anthropic frames the dispute as retaliation for refusing to permit Claude use in <strong>mass surveillance of Americans</strong> and <strong>lethal autonomous warfare without human oversight</strong>. (<a href=\"https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai\">investing.com</a>)</li>\n</ul>\n</li>\n</ul>\n<h3>Theme 4 — Government constraints become first-order business variables (procurement bans, litigation, and competitor substitution)</h3>\n<ul>\n<li><p><strong>Anthropic</strong></p>\n<ul>\n<li>Reuters (Mar 9 explainer) reports Anthropic <strong>sued the US government</strong> and describes a conflict traceable to DoD negotiations: a demand to allow Claude for “all lawful uses,” with Anthropic refusing on surveillance and lethal autonomy grounds; Reuters also describes a Truth Social directive ordering agencies to cease using Anthropic tech and notes agencies cutting ties. (<a href=\"https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai\">investing.com</a>)  </li>\n<li>Bloomberg Law reports Treasury Secretary Scott Bessent saying Treasury is <strong>terminating all use of Anthropic products</strong> following presidential direction. (<a href=\"https://news.bloomberglaw.com/federal-contracting/anthropic-loses-all-us-treasury-contracts-bessent-says?utm_source=openai\">news.bloomberglaw.com</a>)  </li>\n<li>SCMP (Reuters-sourced) likewise reports Treasury ending use of Anthropic products. 
(<a href=\"https://www.scmp.com/news/world/united-states-canada/article/3345184/us-treasury-says-it-stopping-use-anthropics-tech-including-its-claude-platform?utm_source=openai\">scmp.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>OpenAI</strong></p>\n<ul>\n<li>The same procurement turbulence functionally becomes a distribution opening for OpenAI (e.g., agencies switching providers is reported in Reuters-syndicated coverage), but the more concrete in-window signals are the <strong>internal/talent reactions</strong> (next theme). (<a href=\"https://www.investing.com/news/stock-market-news/openai-robotics-head-resigns-after-deal-with-pentagon-4548539?utm_source=openai\">investing.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>xAI / Meta</strong></p>\n<ul>\n<li>No major in-window US procurement shift surfaced in this scan for xAI/Meta; however, enterprise/federal interest remains a background competitive arena given prior reporting about Grok availability to agencies (outside this 8‑day window). (<a href=\"https://www.investing.com/news/stock-market-news/musks-xai-to-provide-grok-chatbot-to-us-federal-agencies-4255904?utm_source=openai\">investing.com</a>)</li>\n</ul>\n</li>\n</ul>\n<h3>Theme 5 — Talent + capital markets: “IPO gravity” and defense posture appear to move people</h3>\n<ul>\n<li><p><strong>OpenAI</strong></p>\n<ul>\n<li>Reuters reports <strong>Caitlin Kalinowski</strong> (head of robotics and consumer hardware) resigned <strong>Mar 7</strong>, citing concerns about OpenAI’s DoD agreement. This is a <em>senior, mission-adjacent</em> exit (robotics/hardware + national security), likely to be interpreted internally as a governance red-line event rather than routine churn. 
(<a href=\"https://www.investing.com/news/stock-market-news/openai-robotics-head-resigns-after-deal-with-pentagon-4548539?utm_source=openai\">investing.com</a>)  </li>\n<li>OpenAI’s rapid release cadence also included explicit lifecycle management: GPT‑5.2 Thinking remains for ~3 months and is scheduled to retire <strong>Jun 5, 2026</strong> (a concrete timeline signal to enterprise developers maintaining legacy behaviors). (<a href=\"https://openai.com/index/introducing-gpt-5-4/\">openai.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>Anthropic (talent inbound signal, but sourcing is mostly secondary in this scan)</strong></p>\n<ul>\n<li>Multiple outlets report OpenAI VP/research leader <strong>Max Schwarzer</strong> announced on X that he is leaving OpenAI to join Anthropic to return to hands-on RL research (OpenAI/Anthropic have not been pulled here as primary confirmations). Treat as <strong>reported / not independently verified</strong> in this briefing. (<a href=\"https://finance.sina.com.cn/wm/2026-03-05/doc-inhpwzrx3338020.shtml?utm_source=openai\">finance.sina.com.cn</a>)</li>\n</ul>\n</li>\n<li><p><strong>Cross-lab / capital markets</strong></p>\n<ul>\n<li>Nvidia CEO <strong>Jensen Huang</strong> stated Nvidia’s recent investments in OpenAI and Anthropic are “likely” its last in both, explaining that expected IPOs would close the opportunity to invest—an explicit public linkage between <strong>frontier lab financing pathways and near-term liquidity expectations</strong>. 
(<a href=\"https://techcrunch.com/2026/03/04/jensen-huang-says-nvidia-is-pulling-back-from-openai-and-anthropic-but-his-explanation-raises-more-questions-than-it-answers/?utm_source=openai\">techcrunch.com</a>)</li>\n</ul>\n</li>\n</ul>\n<h3>Theme 6 — Research engagements (external proof points of “lab models in the wild”)</h3>\n<ul>\n<li><strong>OpenAI + Anthropic (model usage, not corporate announcements)</strong><ul>\n<li>An arXiv proof-of-concept in experimental particle physics (submitted <strong>Mar 5</strong>) reports an analysis and note-writing workflow carried out “entirely by AI agents” using <strong>OpenAI Codex and Anthropic Claude</strong> under expert direction—useful as an external signal of where agent tooling is already being operationalized (scientific pipelines). (<a href=\"https://arxiv.org/abs/2603.05735?utm_source=openai\">arxiv.org</a>)</li>\n</ul>\n</li>\n</ul>\n<h2>Expert opinion and analysis (selected)</h2>\n<ul>\n<li><p><strong>Reuters (Mar 9) — Anthropic’s lawsuit narrative and the “all lawful uses” demand</strong></p>\n<ul>\n<li>Scope: detailed chronology + Anthropic’s framing of the dispute as coercion/retaliation over safety limits (surveillance + lethal autonomy), plus downstream procurement consequences.  </li>\n<li>Why it matters: turns “model policy” into a litigated contract boundary; sets precedent risk for all frontier labs selling to state actors. 
(<a href=\"https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai\">investing.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>TechCrunch (Mar 4) — Nvidia’s posture: public-market trajectory and strategic distancing</strong></p>\n<ul>\n<li>Scope: Huang’s comments at an investor conference, framing OpenAI/Anthropic stakes as effectively “final” pre-IPO opportunities; interpreted as both a capital markets signal and an ecosystem-shaping statement (Nvidia as kingmaker stepping back from incremental equity exposure). (<a href=\"https://techcrunch.com/2026/03/04/jensen-huang-says-nvidia-is-pulling-back-from-openai-and-anthropic-but-his-explanation-raises-more-questions-than-it-answers/?utm_source=openai\">techcrunch.com</a>)</li>\n</ul>\n</li>\n<li><p><strong>FAIR/Meta + NYU (Mar 3) — From-scratch multimodal scaling laws and MoE as a harmonizer</strong></p>\n<ul>\n<li>Scope: controlled multimodal pretraining experiments; key argument is not “multimodal is good” but <em>how to scale it</em>: vision is more data-hungry; MoE narrows scaling asymmetry; “world modeling” appears with minimal domain data.  </li>\n<li>Why it matters: gives technical justification for shifting frontier investment from text-only scaling to <strong>video-heavy corpora + sparse multimodal architectures</strong>. (<a href=\"https://arxiv.org/abs/2603.03276\">arxiv.org</a>)</li>\n</ul>\n</li>\n</ul>\n",
  "body_markdown": "## Tue Mar 3 to Tue Mar 10, 2026 (inclusive)  \n~1,550 words\n\n## Executive synthesis  \nAcross the cycle, the frontier labs split into two visible “centers of gravity”: (1) **agentic enterprise execution** (OpenAI shipping GPT‑5.4’s native computer-use + tool-search, then immediately packaging it into Excel and an AppSec agent) and (2) **cost-optimized scale + multimodal pretraining** (Google pushing a new low-price Gemini tier; Meta/FAIR publishing from-scratch multimodal scaling results). Overlaid on both is a sharpened **state/procurement constraint layer**: Anthropic’s dispute with the US government escalated into litigation and widespread agency offboarding, while OpenAI’s defense engagement triggered a high-salience senior resignation—turning “safety posture” from abstract governance into near-term distribution and talent outcomes. ([openai.com](https://openai.com/index/introducing-gpt-5-4/?utm_source=openai))\n\n## Information (The Core)\n\n### Theme 1 — Agents move from “demo” to “operational surface area” (computer use, tool ecosystems, and workflow closure)\n\n- **OpenAI**\n  - **GPT‑5.4** launched **Mar 5** across ChatGPT (as “GPT‑5.4 Thinking”), API, and Codex; OpenAI positions it explicitly as a **professional-work** frontier model combining reasoning + coding + agentic tool use. ([openai.com](https://openai.com/index/introducing-gpt-5-4/?utm_source=openai))  \n  - The material capability shift is **native computer use** (desktop/app control via screenshots + mouse/keyboard actions) plus **1M-token context** support (notably framed as enabling longer-horizon agents). ([openai.com](https://openai.com/index/introducing-gpt-5-4/?utm_source=openai))  \n  - **Tool search** is introduced to avoid front-loading large tool definitions into every prompt—explicitly targeting “large ecosystems of tools/connectors” and MCP-style tool catalogs; OpenAI reports a **47% token reduction** on a Scale MCP Atlas benchmark with tool search vs. 
exposing all tools directly. ([openai.com](https://openai.com/bn-BD/index/introducing-gpt-5-4/?utm_source=openai))  \n  - **Codex Security** (research preview) shipped **Mar 6** as an “application security agent,” emphasizing deep repo context + automated validation to reduce false positives; rollout targets **Pro/Enterprise/Business/Edu** with **free usage for the next month** (time-boxed adoption push). ([openai.com](https://openai.com/index/codex-security-now-in-research-preview/?utm_source=openai))  \n\n- **Google DeepMind / Google**\n  - **Gemini 3.1 Flash‑Lite** announced **Mar 3** as a preview tier for “highest-volume workloads,” stressing latency + price/performance rather than peak capability; rolled out via **Gemini API (AI Studio)** and **Vertex AI**. ([blog.google](https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/?utm_source=openai))  \n  - The positioning implies a strategic bet that **agentic volume economics** (serving many tool calls / classifications / translations) becomes a primary competitive axis, not just “best model.” ([blog.google](https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/?utm_source=openai))  \n\n- **Anthropic (via partners, not first-party product posts in this scan)**\n  - Japan-channel partner releases repeatedly highlight “**Claude Code**” and a desktop agent “**Claude Cowork**” as a **research preview** being evaluated inside enterprises (notably NRI). This reads as a parallel “agents on the desktop” track—surfacing through integrators/resellers rather than a marquee global launch in this window. 
([nri.com](https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai))  \n\n- **Meta AI (FAIR)**\n  - FAIR+NYU’s **“Beyond Language Modeling”** (submitted **Mar 3**) is research-level reinforcement for agentic multimodal direction: action-conditioned video + world-modeling behaviors are treated as emergent properties of broad multimodal pretraining (vs. narrow robotics-only datasets). ([arxiv.org](https://arxiv.org/abs/2603.03276))  \n\n### Theme 2 — Enterprise distribution “hardens”: integrations, resellers, and packaging become the battleground\n\n- **OpenAI**\n  - **ChatGPT for Excel (beta)** announced **Mar 5**: an Excel add-in embedding ChatGPT directly into spreadsheets, explicitly “powered by GPT‑5.4.” ([openai.com](https://openai.com/index/chatgpt-for-excel/?utm_source=openai))  \n  - OpenAI also added **financial data integrations inside ChatGPT** (FactSet, Dow Jones Factiva, LSEG, Daloopa, S&P Global, etc.), signaling a strategy of winning regulated/high-stakes workflows by attaching to **trusted data rails** rather than relying on model output alone. ([openai.com](https://openai.com/index/chatgpt-for-excel/?utm_source=openai))  \n\n- **Anthropic**\n  - **Japan enterprise channel expansion** shows a reseller/integrator scale strategy:\n    - **Classmethod** (Mar 2) announced an authorized reseller agreement (Amazon Bedrock channel), bundling licensing + consulting/implementation. ([classmethod.jp](https://classmethod.jp/english/news/260302-anthropic/?utm_source=openai))  \n    - **NHN Techorus** (Mar 5) similarly announced Anthropic reseller status for Claude via Bedrock, emphasizing enterprise deployment support. 
([en.sedaily.com](https://en.sedaily.com/technology/2026/03/05/nhn-techorus-becomes-official-anthropic-reseller-in-japan?utm_source=openai))  \n    - **Nomura Research Institute (NRI)** (Mar 6 release) expanded its partnership with Anthropic Japan, describing (i) implementation support services for Japanese enterprises and (ii) internal Claude for Enterprise deployment to build workforce capability; NRI explicitly calls out support extending to **Claude Code** and evaluation of **Claude Cowork** research preview. ([nri.com](https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai))  \n  - Nuance: the cluster of partner-led announcements suggests Anthropic is pushing **regional enterprise penetration** (language/security/regulatory localization) even as its US federal footprint is under acute pressure (see Theme 4). ([nri.com](https://www.nri.com/en/news/newsrelease/files/000058533.pdf?utm_source=openai))  \n\n- **xAI**\n  - xAI’s interim CRO **Jon Shulkin** publicly solicited interest in a **free version of Grok Enterprise** (targeting firms ≥50 employees). This is a classic distribution lever—using freemium to seed deployment footprints and create expansion paths. ([twstalker.com](https://www.twstalker.com/xiankun_xu))  \n\n### Theme 3 — Safety & evaluation gets more “instrumented,” but is increasingly entangled with capability shipping\n\n- **OpenAI**\n  - OpenAI published a safety research post **Mar 5** arguing current reasoning models show **low chain-of-thought (CoT) controllability** (i.e., they struggle to obey instructions that would deliberately reshape/obfuscate their reasoning), positioning this as supportive of **CoT monitoring** as a safeguard; they released **CoT-Control**, an open-source eval suite (~13k tasks). 
([openai.com](https://openai.com/index/reasoning-models-chain-of-thought-controllability/?utm_source=openai))  \n  - The **GPT‑5.4 Thinking system card** (Deployment Safety Hub) is explicitly referenced as the place where CoT monitorability/controllability and “Preparedness Framework” assessments are reported for this release line. ([deploymentsafety.openai.com](https://deploymentsafety.openai.com/gpt-5-4-thinking?utm_source=openai))  \n  - OpenAI also states GPT‑5.4 is treated as **“High cyber capability”** under its Preparedness Framework and “deployed with corresponding protections.” ([openai.com](https://openai.com/index/introducing-gpt-5-4/))  \n\n- **Google DeepMind**\n  - Gemini 3.1 Flash‑Lite’s **model card** is unusually explicit about evaluation categories (agentic tool use, long-context, factuality) and includes a “Frontier Safety Assessment” rationale by reference to Gemini 3.1 Pro assessments (i.e., downstream tiers inherit the frontier risk posture from the most capable family member). ([deepmind.google](https://deepmind.google/models/model-cards/gemini-3-1-flash-lite/?utm_source=openai))  \n\n- **Anthropic**\n  - The most material “safety signal” in-window is not a model card but the **legal/procurement confrontation**: Reuters reports Anthropic frames the dispute as retaliation for refusing to permit Claude use in **mass surveillance of Americans** and **lethal autonomous warfare without human oversight**. 
([investing.com](https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai))  \n\n### Theme 4 — Government constraints become first-order business variables (procurement bans, litigation, and competitor substitution)\n\n- **Anthropic**\n  - Reuters (Mar 9 explainer) reports Anthropic **sued the US government** and describes a conflict traceable to DoD negotiations: a demand to allow Claude for “all lawful uses,” with Anthropic refusing on surveillance and lethal autonomy grounds; Reuters also describes a Truth Social directive ordering agencies to cease using Anthropic tech and notes agencies cutting ties. ([investing.com](https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai))  \n  - Bloomberg Law reports Treasury Secretary Scott Bessent saying Treasury is **terminating all use of Anthropic products** following presidential direction. ([news.bloomberglaw.com](https://news.bloomberglaw.com/federal-contracting/anthropic-loses-all-us-treasury-contracts-bessent-says?utm_source=openai))  \n  - SCMP (Reuters-sourced) likewise reports Treasury ending use of Anthropic products. ([scmp.com](https://www.scmp.com/news/world/united-states-canada/article/3345184/us-treasury-says-it-stopping-use-anthropics-tech-including-its-claude-platform?utm_source=openai))  \n\n- **OpenAI**\n  - The same procurement turbulence functionally becomes a distribution opening for OpenAI (e.g., agencies switching providers is reported in Reuters-syndicated coverage), but the more concrete in-window signals are the **internal/talent reactions** (next theme). 
([investing.com](https://www.investing.com/news/stock-market-news/openai-robotics-head-resigns-after-deal-with-pentagon-4548539?utm_source=openai))  \n\n- **xAI / Meta**\n  - No major in-window US procurement shift surfaced in this scan for xAI/Meta; however, enterprise/federal interest remains a background competitive arena given prior reporting about Grok availability to agencies (outside this 8‑day window). ([investing.com](https://www.investing.com/news/stock-market-news/musks-xai-to-provide-grok-chatbot-to-us-federal-agencies-4255904?utm_source=openai))  \n\n### Theme 5 — Talent + capital markets: “IPO gravity” and defense posture appear to move people\n\n- **OpenAI**\n  - Reuters reports **Caitlin Kalinowski** (head of robotics and consumer hardware) resigned **Mar 7**, citing concerns about OpenAI’s DoD agreement. This is a *senior, mission-adjacent* exit (robotics/hardware + national security), likely to be interpreted internally as a governance red-line event rather than routine churn. ([investing.com](https://www.investing.com/news/stock-market-news/openai-robotics-head-resigns-after-deal-with-pentagon-4548539?utm_source=openai))  \n  - OpenAI’s rapid release cadence also included explicit lifecycle management: GPT‑5.2 Thinking remains for ~3 months and is scheduled to retire **Jun 5, 2026** (a concrete timeline signal to enterprise developers maintaining legacy behaviors). ([openai.com](https://openai.com/index/introducing-gpt-5-4/))  \n\n- **Anthropic (talent inbound signal, but sourcing is mostly secondary in this scan)**\n  - Multiple outlets report OpenAI VP/research leader **Max Schwarzer** announced on X that he is leaving OpenAI to join Anthropic to return to hands-on RL research (OpenAI/Anthropic have not been pulled here as primary confirmations). Treat as **reported / not independently verified** in this briefing. 
([finance.sina.com.cn](https://finance.sina.com.cn/wm/2026-03-05/doc-inhpwzrx3338020.shtml?utm_source=openai))  \n\n- **Cross-lab / capital markets**\n  - Nvidia CEO **Jensen Huang** stated Nvidia’s recent investments in OpenAI and Anthropic are “likely” its last in both, explaining that expected IPOs would close the opportunity to invest—an explicit public linkage between **frontier lab financing pathways and near-term liquidity expectations**. ([techcrunch.com](https://techcrunch.com/2026/03/04/jensen-huang-says-nvidia-is-pulling-back-from-openai-and-anthropic-but-his-explanation-raises-more-questions-than-it-answers/?utm_source=openai))  \n\n### Theme 6 — Research engagements (external proof points of “lab models in the wild”)\n\n- **OpenAI + Anthropic (model usage, not corporate announcements)**\n  - An arXiv proof-of-concept in experimental particle physics (submitted **Mar 5**) reports an analysis and note-writing workflow carried out “entirely by AI agents” using **OpenAI Codex and Anthropic Claude** under expert direction—useful as an external signal of where agent tooling is already being operationalized (scientific pipelines). ([arxiv.org](https://arxiv.org/abs/2603.05735?utm_source=openai))  \n\n## Expert opinion and analysis (selected)\n\n- **Reuters (Mar 9) — Anthropic’s lawsuit narrative and the “all lawful uses” demand**\n  - Scope: detailed chronology + Anthropic’s framing of the dispute as coercion/retaliation over safety limits (surveillance + lethal autonomy), plus downstream procurement consequences.  \n  - Why it matters: turns “model policy” into a litigated contract boundary; sets precedent risk for all frontier labs selling to state actors. 
([investing.com](https://www.investing.com/news/stock-market-news/explaineranthropics-case-against-the-government-what-the-ai-company-says-happened-4550592?utm_source=openai))  \n\n- **TechCrunch (Mar 4) — Nvidia’s posture: public-market trajectory and strategic distancing**\n  - Scope: Huang’s comments at an investor conference, framing OpenAI/Anthropic stakes as effectively “final” pre-IPO opportunities; interpreted as both a capital markets signal and an ecosystem-shaping statement (Nvidia as kingmaker stepping back from incremental equity exposure). ([techcrunch.com](https://techcrunch.com/2026/03/04/jensen-huang-says-nvidia-is-pulling-back-from-openai-and-anthropic-but-his-explanation-raises-more-questions-than-it-answers/?utm_source=openai))  \n\n- **FAIR/Meta + NYU (Mar 3) — From-scratch multimodal scaling laws and MoE as a harmonizer**\n  - Scope: controlled multimodal pretraining experiments; key argument is not “multimodal is good” but *how to scale it*: vision is more data-hungry; MoE narrows scaling asymmetry; “world modeling” appears with minimal domain data.  \n  - Why it matters: gives technical justification for shifting frontier investment from text-only scaling to **video-heavy corpora + sparse multimodal architectures**. ([arxiv.org](https://arxiv.org/abs/2603.03276))",
  "sources": [
    {
      "label": "Legacy public URL",
      "url": "https://05802.github.io/news/202603100417_frontier_labs/"
    },
    {
      "label": "Legacy source markdown",
      "url": "https://raw.githubusercontent.com/05802/05802.github.io/master/_roll/2026-03-10-0417-frontier_labs.md"
    }
  ],
  "content_prefix": "entries/roll/frontier-labs/2026/03/202603100417_frontier_labs/",
  "assets_prefix": "entries/roll/frontier-labs/2026/03/202603100417_frontier_labs/assets/",
  "assets_base_url": "https://stations.work/content/entries/roll/frontier-labs/2026/03/202603100417_frontier_labs/assets/",
  "canonical_url": "https://stations.work/roll/202603100417_frontier_labs"
}