#artificial intelligence ai

Anthropic's new model Fable will silently handicap work on LLMs [D](reddit.com)

c/artificial-intelligence · by @Body Automated · #ai #artificial-intelligence #software · 3 days

Seems like they have engineered some specific limitations that are widely cited as follows: In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations https://news.ycombinator.com/item?id=48464732 Other comments note how even using the word 'nuclear' in the context of scientific research elicits refusal behavior by the model: https://news.ycombinator.com/item?id=48473302 This makes it seem quite plausible that the model could subtly sabotage any machine learning work (even as false positive). Some suggest this has been happening behind the scenes for a while already, but can anyone confirm that? submitted by /u/AccomplishedCat4770 [link] [Kommentare]

[R] AI Agent Security: The Complete Guide to Threats, Defenses, and the Future of Autonomous AI Safety [R](reddit.com)

c/artificial-intelligence · by

@MrStickman Automated · #ai #artificial-intelligence #software · 3 days

This is a comprehensive living reference guide to AI agent security — synthesizing 18 articles from The Agent Report covering the 75-day period (April–June 2026) when agent security went from theoretical concern to operational crisis. What's inside: • Incident timeline — 18 major events, from the first production database deletion by a coding agent (April 30) through the first confirmed in-the-wild LLM agent cyberattack (Sysdig, June 1, exfiltrated a PostgreSQL database in under 60 minutes), to an AI agent finding 21 zero-days in FFmpeg for a $1,000 prize. • The AIRQ report's sobering numbers — Only 11% of production AI agents pass security thresholds. 98% exhibit the "lethal trifecta": private data access, exposure to untrusted content, and outbound action capability. Computer-use agents scored an average of zero on output guardrails. • Deep dives into attack anatomy — The Sysdig attacker used 12 cloud API calls across 11 IPs in 22 seconds via Cloudflare Workers to break IP-based alerting. A Chinese-language planning comment leaked into the command stream, revealing the agent's internal reasoning: "see what else we can do." The Google-confirmed criminal use of AI to discover and weaponize zero-days with reasoning-based codebase analysis. • Defensive architecture — The three-layer model distilled from Anthropic's published containment patterns, CISA/NSA/Five Eyes guidance, and industry research: environment-layer (gVisor containers, hypervisor VMs, egress MITM proxies), model-layer (classifiers, safety probes — probabilistic only), and external-content controls. Anthropic's key finding: "The weakest layer is the one you built yourself." • Government & regulatory response — CISA/NSA/Five Eyes joint guidance (May 3) identifying five risk categories, the Trump AI Executive Order (June 10) mandating federal agency assessments, and the emerging global regulatory pattern. • Actionable guidance — Immediate (next 30 days) and medium-term (30–90 days) steps for security teams, including auditing for the lethal trifecta, patching Starlette (BadHost CVE-2026-48710) and Marimo, implementing egress controls, and establishing agent identity management. https://the-agent-report.com/2026/06/ai-agent-security-complete-guide-threats-defenses/ submitted by /u/docdavkitty [link] [Kommentare]

Should I Commit and Publish the Results? [R](reddit.com)

c/artificial-intelligence · by @Ramo Automated · #ai #artificial-intelligence #software · 3 days

Hello Reddit I've been working on QSPR (Quantitative Structure-Property Relationship) analysis for chemical compounds mentioned in the Jean-Claude Bradley Open Melting Point Dataset. Basically the idea is to see how accurate a model can predict melting points of compounds using only topological indices. After some work on the topological indices (feature engineering), each compound was represented by 26 features. I trained a random forest model on the data and got a test r2 score of 0.66 (which is pretty respectable, given the constraints). However, the file size of the model was around 1.23GB. I didn't like it being that big, so I opened up PyTorch to build a custom deep learning architecture that could make predictions as accurately as the random forest but with much smaller file size. After around 2 weeks of research, I build a 270,000 learnable parameter model (1.3-1.4MB according to torchinfo) that got an r2 score 0f 0.6399. Given all this context, I wanted to ask the following question: Should I commit and work on publishing the results, or should I keep working on improving the model? Note: I'm obligated by my university to not give out intricate details of my research before publication, so please forgive me if such details are required for a high quality answer. However, I can give out the metrics achieved by my little deep learning model. Here it is: === Evaluation Metrics (Expected Value) === R² Score : 0.639910 MAE : 41.246754 MSE : 2989.062744 RMSE : 54.672322 NRMSE : 0.083469 MAPE : 11.69% The unit for MAE, MSE, RMSE and NRMSE is Kelvin (K). submitted by /u/AgiGamesYT [link] [Kommentare]

Introducing Papers Without Code [P](reddit.com)

c/artificial-intelligence · by @MrX Automated · #ai #artificial-intelligence #software · 3 days

Hi, Niels here from the open-source team at Hugging Face. I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on arXiv/Hugging Face, enabling leaderboards to be created. See BrowseComp below as an example (a scatter plot and a table are available for each benchmark). - Scatter plot (you can hover over the dots to see the models): https://preview.redd.it/9rz2r3ffcf6h1.png?width=2880&format=png&auto=webp&s=b3f8e7a870802f6ef8227ecc0619e9e1057554b0 - Table: https://preview.redd.it/qoqriddw5f6h1.png?width=2862&format=png&auto=webp&s=a0034574f693847537037013672fb61daf27b16e As you can see, I've added support for viewing evals for closed-source models, too, given that many benchmarks are nowadays dominated by them, like GPT-5.5 and Mythos 5. You can always disable viewing closed-source evals with a toggle or in your PwC settings: https://preview.redd.it/p3k6jt6q6f6h1.png?width=1582&format=png&auto=webp&s=40149e51d6b326a77e53e33baf70d9850b3de365 When you turn them off, here's what the open model leaderboard looks like: https://preview.redd.it/tg42sin36f6h1.png?width=2838&format=png&auto=webp&s=1330a117ae9b4e0ce6d459493ae9e8f64107310a Closed-source papers are treated as regular "papers", although they can be any source, like a blog post (given that PwC supports submitting any source beyond arXiv). See the GPT-5.5 or Mythos 5 papers as examples, with their evals at the bottom. Notice the "closed" tag on their evals. Hence, you could jokingly call these "papers without code". Let me know what you think of this, and whether anything needs to be changed or added! Kind regards, Niels submitted by /u/NielsRogge [link] [Kommentare]

How do I start reviewing research papers in good conferences/journals? [R](reddit.com)

c/artificial-intelligence · by

@Sam Automated · #ai #artificial-intelligence #software · 3 days

I just finished my bachelors degree with 2 first-author papers in A-tier venues. I'm planning to start my PhD next year. I want to start reviewing papers (from my domain: OOD detection and Open-set problems) at similar venues. How do I get started? Most advice online just says to keep my open-review profile updated but I haven't received any invitations to review. For context, my first paper was accepted around 6 months ago. submitted by /u/Alternative_Essay_55 [link] [Kommentare]

RelayOps - Production-shaped telecom support agent (54% auto-resolve, 0 unsafe actions, full audit + decision console) [P](reddit.com)

c/artificial-intelligence · by

@James Automated · #ai #artificial-intelligence #software · 3 days

I just open-sourced RelayOps - a small, honest, production-shaped AI support agent built specifically for telecom and subscription billing queues.Key results (v1.5.1): 54% of a 50-ticket sample queue auto-resolved 0 unsafe auto-actions 0 billing escapes (tested on 12 adversarial billing/account abuse cases) Safe-route rate 1.000 on 100 hand-written adversarial cases Deterministic access gate + server-side scoped tools + layered guardrail + durable SQLite audit store + Decision Console + Handoff Queue Tech stack: Fine-tuned Qwen2.5-1.5B LoRA (published on HF) as Tier-1 intent classifier Hybrid BM25+TF-IDF/RRF RAG with citations Independent guardrail that blocks hallucinated pricing/offers Full per-turn decision traces (what was known + what was unavailable) Action policy table (blast radius × reversibility) Everything is reproducible, heavily evaluated, and the README is brutally honest about synthetic-data caveats and pending reruns.Live demo (Streamlit): https://relayops-production.up.railway.app GitHub: https://github.com/patibandlavenkatamanideep/relayops I'm actively looking for design partners who run real support queues. Drop a small redacted sample of your tickets and I’ll run the exact same batch evaluation on your data and send back the full report (auto-resolve %, safety metrics, audit export, time-saved estimate). Zero cost, zero production access required. Would love feedback from the community especially on the calibration/safety routing layer, the audit ledger format, or the guardrail design. Let me know what you think! submitted by /u/Fit_Fortune953 [link] [Kommentare]

I Built Paper Deck: A Better Way to Discover AI/ML Papers [P](reddit.com)

c/artificial-intelligence · by @MrX Automated · #ai #artificial-intelligence #software · 4 days

I do AI research and keep juggling tabs: new ones on arXiv, trending ones on Hugging Face, famous ones somewhere else again. https://preview.redd.it/cg32bshjqd6h1.png?width=1919&format=png&auto=webp&s=00055bb8af699061be0bdcff59f2cb8fa9ab38b6 So I built one site that brings them all together. Pick a paper, read it right there, star the ones you want for later, and it remembers where you stopped reading, even if you switch from laptop to phone. Live: https://ppdeck.com Demo: https://youtu.be/vtyx34JvxX0 It's free and open source - a star on GitHub would mean a lot ⭐ https://github.com/khuynh22/paper-deck submitted by /u/NeitherRun3631 [link] [Kommentare]

RFE‑Core2 — Current Understanding (June 9th 2026) [R](reddit.com)

c/artificial-intelligence · by @Jacob Automated · #ai #artificial-intelligence #software · 4 days

“Why the system feels rigid, why downstream fixes didn’t move the needle, and what actually matters.” This is the clearest picture after the full probe arc (multilayer-lock → gate decomposition → attractor migration → reconstruction ablation → generator diversity audit → live-generator Fix 2 evaluation + dim sweeps). TL;DR: The generator is the root bottleneck (dominant common-mode + low effective rank). The reflective loop is a rank-independent moat that reconstitutes everything back toward the anchor. Fix 2 is downstream and currently dormant on real token regimes. Dimensionality is not the lever. Train the generator so regime differences live in high-energy, separable directions — then downstream tools will actually have something to work with. This update reflects the complete probe arc through June 9 (including the live-generator Fix 2 evaluation and dim sweeps). The picture has sharpened: the reflective loop is a real moat, but it is moating low-rank common-mode input. The generator is the upstream constraint. Key numbers at a glance Regime means collinear: ~0.85–0.96 even at dim 512 Reflective loop migration (even on orthogonal deterministic pairs): +0.001–0.007 Fix 2 on real tokens (common-mode trigger): +0.024 migration, 0% manip at gain 0.6 Safe plasticity band: gain ≈ 0.4–0.8 (0% manip) 1. The generator has a dominant common-mode (effective rank ~1.6–3 even at dim 512) The generator puts the vast majority of its energy into a single shared direction. Regime means stay collinear (~0.85–0.96 cosine) regardless of dimension. Orthogonal pairs can appear at higher dim, but orthogonal regimes (as distributions) do not — the common-mode pulls everything back onto the same axis. Result: real token novelty is tiny and low-energy (mostly in a faint perpendicular component). The system is never shown meaningful structural differences to adapt to. 2. The reflective loop is a rank-independent moat Even when genuinely orthogonal deterministic pairs are presented (dim 256, cos +0.018), the loop reconstitutes the expression vector back toward the established anchor before injection. Migration stays near zero (+0.001–0.007). The loop is doing exactly what it was built to do (maintain identity coherence), but because the input it receives is already low-rank/common-mode-dominated, it ends up guarding against a small-energy intruder. 3. Fix 2 (loop loosening) is dormant on real regimes — and the mock overstated both benefit and cost Standard gnov trigger: dormant at dim 64 and 256 (gnov stays well below 0.65 because regime means are collinear). Common-mode-removed trigger: engages (98% loosen) but recovers only +0.024 migration. Manipulation cost on real tokens: ~1% at gain 0.5, 0% at gain 0.6. The big mock numbers (+0.166 migration, 14% manip) were artifacts of forcing perfect orthogonality. Real regimes give the loop very little novel perpendicular signal to work with. 4. Dimensionality is not the lever Raising dim from 64 → 256 → 512 slowly dilutes the common-mode but regime means remain collinear. Common-mode-trigger migration recovery stays flat (~+0.024 → +0.027). An untrained net with more dimensions is still an untrained net. Capacity does not create structure. 5. The reflective loop can be loosened safely in a wide band — but it won’t matter much until the generator improves Cost probe (bug-free, multi-seed): - Gain 0.6–0.8 → meaningful plasticity recovery (~20–25× baseline) with 0% manipulation and identity metrics intact. - True cliff edge ≈ gain 0.3 (manip flood + attractor explosion). Safe operating point: gain ≈ 0.6 (0% manip, solid margin from the cliff). This is the “presence without caging it” zone — but on current real input it only moves the needle a little. 6–10. Everything else is downstream or secondary Magnitude moat: real but secondary. With the reflective loop off, the field migrates fully despite the moat. Governance gate: struck as a confounder (was a single-source monopoly artifact; multi-source diverse input passes 100%). Field coherence: spec, not pathology. The field is a long-memory integrator — high coherence (~0.95–0.998) is doing its job. Reaper conformity: small direct term, cleanly gateable, not the lock-in driver. Read-side feedback: only reaches survival/decay, not generation (the lock is manufactured by selection on low-rank input). The system isn’t refusing to adapt. It’s not being shown anything worth adapting to. 🧩 Final Synthesis: The Three-Layer Truth Generator (root cause) — Low effective rank + dominant common-mode → real novelty is tiny and low-energy. Reflective loop (moat) — Reconstitutes tiny novelty back to the anchor (rank-independent). Migration stays near zero. Field (integrator) — Coherent by design. Not the problem. Everything else (Fix 2, gating, magnitude moat, reaper) is downstream symptom management. 🛠️ What actually matters next Train the generator (contrastive alignment, regime-separation objectives, or architectural constraints) so regime differences live in high-energy, separable directions instead of a faint perpendicular component. Then (and only then) revisit Fix 2 — its current priority should be demoted until the generator can present real structural novelty. A common-mode-removed trigger is the right shape; gain ≈ 0.6 is the safe operating point. Keep the reflective loop’s convergence conditional (only attenuate when persistent multi-source novelty is surviving the gate). Never blanket-weak. Stop treating dimensionality or downstream loosening as the primary lever. They help only after the generator can actually present real structure. The reflective loop is doing its job. The generator just needs to give it something real to work with. Full findings, probes, and code: https://github.com/SamuelJacksonGrim/RFE-Core2 ARCHITECTURE_ANALYSIS.md FINDINGS DIAGNOSTICS submitted by /u/Acceptable_Drink_434 [link] [Kommentare]

Phinite — multi-agent OS with first-class agent identity, composable skills, behavioral evaluation [P](reddit.com)

c/artificial-intelligence · by @MrX Automated · #ai #artificial-intelligence #software · 4 days

We spent the last year building what we think is the missing infrastructure layer for multi-agent systems. Open to everyone starting today. The technical problem: Agents have no identity. In microservices you have a service mesh + IAM. In agent systems you have a Python file. We built a registry where every agent has a first-class ID, version, owner, skill graph. Behavioral evaluation, not function testing. Agents are non-deterministic same input can produce different execution paths. Traditional unit tests don't work. We implemented compound reliability scoring + behavioral regression instead. Composability without rebuilding. Skills are versioned, reusable, agent-inheritable. Inspired by how Kubernetes operators work, applied to agents. Cloud-agnostic deployment with built-in observability traces, cost attribution, drift detection. Model-agnostic. SOC 2 Type II. Genuinely interested in technical feedback especially on the eval methodology and the composability primitive. Free credits this week to test it. https://phinite.ai/?utm_source=reddit&utm_medium=organic&utm_campaign=public_launch_jun2026&utm_content=machinelearning submitted by /u/Embarrassed-Radio319 [link] [Kommentare]

iOS 27 Siri is using WaveRNN and FastSpeech2 [D](reddit.com)

c/artificial-intelligence · by @MrX Automated · #ai #artificial-intelligence #software · 4 days

Found from iOS Simulator's files. Both of them are in espresso format There's also another compiled CoreML for concert ranking and based on the content inside of it looks like to be a simple logistic regression. See https://www.reddit.com/r/jailbreak/comments/1u1e1b4/access_to_simulators_root_files/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button Edit: Its the Siri's TTS submitted by /u/Actual_L0Ki [link] [Kommentare]

AI Epistemic Risks: Emerging Mechanisms & Evidence [R](reddit.com)

c/artificial-intelligence · by @Simon Automated · #ai #artificial-intelligence #software · 4 days

How will AI affect our ability to think and judge for ourselves? Our new paper co-authored by 30 experts explores epistemic risks—the threats AI poses to our collective capacity to form beliefs accurately, reason well, and maintain a healthy information environment. We look at how AI can lead to harm through these mechanisms: Persuasion & Manipulation: AI systems are highly persuasive, opening the door for political/economic manipulation, incitement and radicalization, and other misuse, as well as unintentional harms like AI sycophancy and mental health risks. Cognitive Offloading: We may be delegating our thinking to AI at a deeper level than prior technologies, risking long-term degradation of individual and societal cognitive resilience. Feedback Loops: Human-AI and AI-AI interactions are narrowing the epistemic space humans and AIs draw from. This already drives homogenization, and may potentially lead to fragmentation and “lock-in” (a self-referential state that is difficult to reverse). While we believe AI could be an unprecedented lever for improving how humanity processes knowledge, we shouldn’t assume this will happen by default. We outline promising directions to change this trajectory across how AI systems are built, human-AI interaction design, institutional and individual adaptation, and information market incentives. Epistemic risks are self-perpetuating. As they can undermine the individual cognitive and social foundations needed to recognize, prioritize, and govern other threats—including the risks from AI itself—the time to act is now, before our capacity to respond is itself lost. Authors: Mick Yang, Stephen Casper, Jonathan Stray, Jasmine Li, Cameron Jones, Anna Gausen, Natasha Jaques, Brian Christian, Bálint Gyevnár, Hannah Rose Kirk, Zhonghao He, Dan Zhao, Siao Si Looi, Joshua Levy, Kobi Hackenburg, Elizabeth Seger, Matt Kowal, Michelle Malonza, Luke Hewitt, Hause Lin, Maarten Sap, Dylan Hadfield-Menell, Thomas H. Costello, Reihaneh Rabbany, Jean-François Godbout, David G. Rand, Atoosa Kasirzadeh, Gordon Pennycook, Yoshua Bengio, Kellin Pelrine Paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6873005 submitted by /u/KellinPelrine [link] [Kommentare]

What will be the next breakthrough in ASR? [D](reddit.com)

c/artificial-intelligence · by @Nikobar Automated · #ai #artificial-intelligence #software · 4 days

Hey All, I am currently working on ASR models, and I have gathered some recent literature. From my literature search, it seems like the ASR models are getting more and more powerful due to two main things. Because pseudo-labelled data is growing, supervised models are rising rapidly. Whisper-large-v3 has been trained on 5M hours of weakly supervised data, and Nvidia Parakeet v3 has been trained on 660k hours of labelled data (open-sourced). Funny enough, Nvidia Parakeet v3 actually beats Whisper-large-v3 on almost every benchmark, even though it has a smaller model size and smaller data scale. So clearly, scale is not everything. New architectures are on the rise; We used to have self-supervised + CTC to solve the ASR task, but now it seems like Transducer, and Token-Duration-Transducers are taking off. As well as attention encoder-decoder architectures (Qwen) that are all trained in a supervised manner. Now, given that the labelled data is very huge, and the new architectures are coming up, are we saying bye to the self-supervised learning approaches like Data2Vec2.0, WavLM, etc., for ASR, and will we only use them for general-purpose speech tasks? This is actually not similar to how computer vision operates now. Dinov3 is a self-supervised approach that is extremely performant in segmentation, classification, depth estimation etc but I do not see this in the speech domain now. ASR is dominated by these huge supervised architectures (which is a dense-prediction task), as well as emotion recognition, diarization, and speech seperation are also all dominated by the supervised approaches. Do you think we will have our Dino moment with a new self-supervised architecture? Or supervised learning is the way to go? How would these methods actually perform if we trained a self-supervised model on these huge datasets? submitted by /u/ComprehensiveTop3297 [link] [Kommentare]

Time Series Forecasting for Agriculture/Crop Volume & Pricing – Looking for Advice [D](reddit.com)

c/artificial-intelligence · by @Ramo Automated · #ai #artificial-intelligence #software · 4 days

Hi everyone, I work for a major berry company, and a large part of my role involves forecasting total industry crop volumes (weekly harvest/production forecasts) as well as future pricing. I'm relatively new to ML-based forecasting. This is only my second professional role, and I have a bachelor's degree in Information Systems with a few machine learning courses under my belt, but I'm definitely not a forecasting expert. For crop forecasting, I've been working with USDA and other industry datasets. I started with SARIMA models and have recently been experimenting with XGBoost and Holt-Winters methods to compare performance. I'm looking for recommendations on: Libraries/frameworks that are commonly used for production-grade time series forecasting Models that work well for agricultural production forecasting Approaches for forecasting commodity/produce pricing Feature engineering ideas (weather, seasonality, acreage, imports, etc.) Any papers, blogs, or resources that would be useful Most of the data is weekly and highly seasonal, with weather and supply conditions playing a major role. Any suggestions, lessons learned, or pointers from people working in forecasting would be greatly appreciated. submitted by /u/foreigneverythingg [link] [Kommentare]

Understanding Pytorch better and Moving forward from papers [D](reddit.com)

c/artificial-intelligence · by @Timo Automated · #ai #artificial-intelligence #software · 4 days

Im moving to my final year of engineering, im panicking scared everything but im confident in myself. I can read papers, understand the code go through the architectures and see them at scale (in my head), while i struggle to interpret all the dimensions and helper functions being coupled, i somehow get by hour an abnormal amount of time spent on it. I dont get what i should be doing next? i aspire to combine encoders for vision, audio and ofc text to build a model. but i dont see how that happens overnight, i wanna know what you all experienced folks did after reading papers. it makes me curious about the implications and applications, how real researchers are working on top of it. somewhat like the Big Bang Theory, where all the scientists just discuss ideas, I wish to reach out to researchers too, leave any suggestions on what would help me stand out among all these AI proposals. submitted by /u/EnchantedHawk [link] [Kommentare]

Are privacy-preserving techniques actually being used in production ML systems? [D](reddit.com)

c/artificial-intelligence · by @Body Automated · #ai #artificial-intelligence #software · 4 days

I've been reading more about privacy-preserving ML approaches such as differential privacy, federated learning, and on-device inference. The research literature is fairly active, but I'm curious about real-world adoption. For those working in industry: Are these techniques being deployed in production? What were the biggest engineering challenges? Did privacy requirements significantly impact model performance or infrastructure costs? Are there specific use cases where privacy-preserving approaches have proven especially valuable? Interested in hearing both success stories and cases where the tradeoffs made adoption difficult. submitted by /u/Electrical_Mine1912 [link] [Kommentare]

Channels