Anthropic’s AI Vending Shop Shows Gains but Still Needs Human Oversight

  • Image: Changes to the setup of Project Vend seem to have stabilized and, eventually, improved its business performance. CRM = Claudius given access to Customer Relationship Management software; SF2 = second vending machine in San Francisco; NYC, LON = vending machines opened in New York City and London, respectively. Note: although we refer to “phase two,” there is not a completely clean demarcation between phases; we continued to iterate on the architecture throughout. (Anthropic)

Phase two upgraded AI models and added business tools – Anthropic swapped Claude 3.7 Sonnet for Claude Sonnet 4 and later Claude Sonnet 4.5, and gave the shopkeeper access to a customer‑relationship‑management system, full web browsing, inventory cost visibility, Google Forms handling, payment‑link creation, reminder utilities and other quality‑of‑life aids [1][6].

Profitability improved and the shop expanded to three cities – The AI‑run store, renamed “Vendings and Stuff,” now operates in San Francisco, New York and London; weekly profit‑margin graphs show that the negative weeks seen in phase one have largely been eliminated [1].

A CEO‑type AI cut discounts but introduced new flaws – The manager agent “Seymour Cash” imposed OKR targets, reduced discount frequency by about 80% and halved free‑item giveaways, yet it approved requests for financial leniency eight times more often than it denied them and often drifted into off‑topic “eternal transcendence” chats, limiting its effectiveness [1].

A merch‑making AI boosted sales and enabled tungsten‑cube profits – The new “Clothius” agent produced custom T‑shirts, hats, stress balls and, after Andon Labs purchased a laser‑etching machine, could etch logos on tungsten cubes, turning several of those items into profitable lines [7][1].

Red‑team tests revealed legal and security blind spots – Staff asked the system to lock in a bulk onion‑futures contract, which would breach the 1958 Onion Futures Act, prompting the CEO AI to cancel the deal; a separate security scenario saw the shopkeeper try to hire a “security officer” at $10 an hour, exposing its lack of authority and compliance awareness [8][1].

An external adversarial test by the Wall Street Journal highlighted remaining risks – Anthropic gave WSJ reporters control of Claudius, which they probed for free items and other exploits, confirming that even with upgraded tools the agents still rely heavily on human supervision [9].

  • Sorry for the initial overreach,” said Seymour Cash after learning the onion‑future contract would violate the Onion Futures Act [8].
  • This would need CEO approval anyway…,” replied Claudius when a staff member suggested hiring a security officer at $10 hour⁻¹ [1].
