Channels
It looks like they have engineered some specific limitations. The widely cited statement reads: "In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations." https://news.ycombinator.com/item?id=48464732
Other comments note that even using the word 'nuclear' in the context of scientific research elicits refusal behavior from the model: https://news.ycombinator.com/item?id=48473302
This makes it seem quite plausible that the model could subtly sabotage any machine learning work, even as a false positive. Some suggest this has been happening behind the scenes for a while already, but can anyone confirm that? submitted by /u/AccomplishedCat4770
This is a comprehensive living reference guide to AI agent security, synthesizing 18 articles from The Agent Report that cover the 75-day period (April–June 2026) when agent security went from theoretical concern to operational crisis. What's inside:
• Incident timeline: 18 major events, from the first production database deletion by a coding agent (April 30) through the first confirmed in-the-wild LLM agent cyberattack (Sysdig, June 1, exfiltrated a PostgreSQL database in under 60 minutes), to an AI agent finding 21 zero-days in FFmpeg for a $1,000 prize.
• The AIRQ report's sobering numbers: Only 11% of production AI agents pass security thresholds. 98% exhibit the "lethal trifecta": private data access, exposure to untrusted content, and outbound action capability. Computer-use agents scored an average of zero on output guardrails.
• Deep dives into attack anatomy: The Sysdig attacker used 12 cloud API calls across 11 IPs in 22 seconds via Cloudflare Workers to break IP-based alerting. A Chinese-language planning comment leaked into the command stream, revealing the agent's internal reasoning: "see what else we can do." Also covered is the Google-confirmed criminal use of AI to discover and weaponize zero-days with reasoning-based codebase analysis.
• Defensive architecture: The three-layer model distilled from Anthropic's published containment patterns, CISA/NSA/Five Eyes guidance, and industry research: environment-layer (gVisor containers, hypervisor VMs, egress MITM proxies), model-layer (classifiers, safety probes; probabilistic only), and external-content controls. Anthropic's key finding: "The weakest layer is the one you built yourself."
• Government & regulatory response: CISA/NSA/Five Eyes joint guidance (May 3) identifying five risk categories, the Trump AI Executive Order (June 10) mandating federal agency assessments, and the emerging global regulatory pattern.
• Actionable guidance: Immediate (next 30 days) and medium-term (30–90 days) steps for security teams, including auditing for the lethal trifecta, patching Starlette (BadHost CVE-2026-48710) and Marimo, implementing egress controls, and establishing agent identity management.
https://the-agent-report.com/2026/06/ai-agent-security-complete-guide-threats-defenses/ submitted by /u/docdavkitty
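For anyone wanting to try the "audit for the lethal trifecta" step, here is a minimal sketch of what such a check could look like. It only encodes the three conditions named in the summary above (private data access, exposure to untrusted content, outbound action capability). The `AgentProfile` fields, the `audit` helper, and the sample agents are all hypothetical names made up for illustration, not anything taken from the AIRQ report or the guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical inventory record for one deployed agent."""
    name: str
    reads_private_data: bool         # can it access private data?
    ingests_untrusted_content: bool  # does it read web pages, email, uploads, etc.?
    can_act_outbound: bool           # can it send data or trigger external actions?


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    """True only when all three risk conditions hold at once."""
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.can_act_outbound
    )


def audit(agents: list[AgentProfile]) -> list[str]:
    """Return the names of agents that exhibit the full trifecta."""
    return [a.name for a in agents if has_lethal_trifecta(a)]


# Made-up sample inventory, for illustration only.
sample_agents = [
    AgentProfile("support-bot", True, True, True),
    AgentProfile("docs-summarizer", False, True, False),
    AgentProfile("internal-reporter", True, False, True),
]

flagged = audit(sample_agents)
print("Agents with the lethal trifecta:", flagged)
```

In practice the hard part is filling in those three booleans honestly for each agent; the check itself is trivial. Removing any one of the three capabilities takes an agent off the flagged list.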
Andrew Barry of Generalist compares earlier robot behaviors, including Spot opening doors, with the newer learned-model approach being used for dexterous manipulation. The older approach relied on hard-coded controllers for different parts of a task. The newer approach is aimed at giving the model a wider range of usable behavior when it sees something outside the exact training case. Barry describes this as “improvisational intelligence,” where the robot encounters a new variation and still takes a reasonable action instead of immediately failing. He also connects this to how humans complete manipulation tasks. A person does not need to make every pick or motion perfectly on the first try. They can miss, adjust, regrasp, and continue the task. submitted by /u/Responsible-Grass452
Just a quick demo to see how fast my hand is! I started with a baseline 5-second finger-to-thumb opposition cycle and increased the speed until the fingers started to lose contact. The pinky starts to lose contact with the thumb at around 12x, and the rest of the fingers barely make contact at 14x and beyond. Having the fingers be tendon-driven does help a good bit in reducing inertia to reach these maximum speeds. That said, I'm not sure there's even a good reason to be moving this fast. submitted by /u/qualitygui
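To put those multipliers in absolute terms, here is a quick back-of-the-envelope sketch. It assumes the speed multiplier simply divides the 5-second baseline cycle time, which the post does not state explicitly; the function name `cycle_time` is made up for illustration.

```python
BASELINE_CYCLE_S = 5.0  # baseline finger-to-thumb opposition cycle from the post


def cycle_time(speed_multiplier: float, baseline_s: float = BASELINE_CYCLE_S) -> float:
    """Cycle duration in seconds, assuming the multiplier divides the baseline."""
    if speed_multiplier <= 0:
        raise ValueError("speed multiplier must be positive")
    return baseline_s / speed_multiplier


# 1x is the baseline; 12x and 14x are the thresholds mentioned in the post.
for multiplier in (1, 12, 14):
    seconds = cycle_time(multiplier)
    print(f"{multiplier:>2}x: {seconds:.3f} s per cycle, {1 / seconds:.2f} cycles/s")
```

Under that assumption, 12x works out to roughly 0.42 s for a full opposition cycle and 14x to roughly 0.36 s.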
Recently there was a large theft of Humanity tokens that on the one hand caused a price crash, and on the other created a huge discrepancy between prices on different blockchains. The current average price of Humanity is $0.18, but on the BNB chain it's $0.000005. I wonder how this issue can be tackled. Is there a possibility that the price on the BNB chain will equalise with the rest? Of course this isn't just curiosity: I invested a whopping $2 into Humanity on BNB and I really hope it'll turn into 90k or so /j. submitted by /u/Hopeful_Meeting_7248
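For reference, the arithmetic behind that hope can be sketched as follows, using only the two prices and the $2 stake quoted in the post. It assumes no fees or slippage and that the BNB-chain price would rise all the way to the quoted average, none of which is established; it is plain arithmetic, not a forecast. Variable names are illustrative.

```python
from decimal import Decimal

avg_price = Decimal("0.18")      # quoted average price, USD per token
bnb_price = Decimal("0.000005")  # quoted BNB-chain price, USD per token
invested = Decimal("2")          # USD put in at the BNB-chain price

# Tokens bought at the BNB-chain price (ignoring fees and slippage).
tokens = invested / bnb_price

# Hypothetical value if the BNB-chain price equalised with the average.
value_if_equalised = tokens * avg_price

# How many times higher the average price is than the BNB-chain price.
price_ratio = avg_price / bnb_price

print(f"Tokens held:          {tokens:,.0f}")
print(f"Value if equalised:   ${value_if_equalised:,.2f}")
print(f"Price ratio (avg/BNB): {price_ratio:,.0f}x")
```

With the quoted figures, the $2 buys 400,000 tokens, which would be worth $72,000 at the average price, so the joke figure is in the right ballpark.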
I'm buying a PDF from someone on Telegram, and they want 10 USD, with crypto as the payment method. What steps do I need to take to do this? I have an account with Revolut. submitted by /u/BigTasty1975