Building an Open Source Edge Semantic Cache for LLMs in Rust/WASM – Sanity check on the architecture? [D](reddit.com)
Hey everyone, I am planning out a new open-source infrastructure project and want to get some brutal feedback on the architecture and use-case validity from people running high-volume LLM workloads in production.

The Problem: Python-based proxies/gateways introduce too much latency overhead for real-time streaming agent steps or fast UI completions. Additionally, centralized semantic caching still suffers from cross-region network latency (e.g., London to us-east-1), and enterprise API costs remain a massive bottleneck for repetitive/predictable user queries (like customer support or structured data extraction).

The Proposed Architecture: Instead of a heavy centralized gateway, the goal is to build a lightweight, zero-dependency semantic cache running directly at the CDN Edge using WebAssembly (WASM) compiled from Rust. The flow looks like this:

1. Inbound Prompt: Hits the edge node closest to the user (e.g., Cloudflare Workers / Fastly Compute).
2. Edge Embedding: The Rust/WASM module intercepts the raw text prompt and instantly generates a vector using an edge-native lightweight model (e.g., bge-small-en-v1.5).
3. Similarity Index Check: It performs a fast cosine similarity check against an edge vector database (like Cloudflare Vectorize) to find the nearest semantic neighbor.
4. Cache Hit: If similarity >= threshold (e.g., 0.88), it pulls the full generated response text from an edge KV store and returns it in ~5ms. The main LLM provider is never billed or touched.
5. Cache Miss: It proxies the streaming request to OpenAI/Anthropic/vLLM, streams it back to the client, and asynchronously updates the edge vector index and KV store.

Why Rust/WASM? To achieve sub-millisecond execution overhead on the proxy itself, avoid garbage collection pauses, and maintain a tiny memory footprint suitable for edge runtime constraints where traditional databases or Python scripts cannot run.
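The hit/miss decision in the flow above can be sketched in a few lines. This is only an illustration in Python for brevity, not the proposed Rust/WASM implementation: the `SemanticCache` class, `handle_prompt`, and the `call_upstream` callback are hypothetical names, the in-memory list stands in for both the edge vector index and the KV store, and the embedding is assumed to be computed elsewhere.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class SemanticCache:
    """Hypothetical in-memory stand-in for the edge vector index plus KV store."""
    threshold: float = 0.88  # example threshold mentioned in the post
    entries: list = field(default_factory=list)  # (embedding, response) pairs

    def lookup(self, embedding):
        """Return the nearest neighbor's response if it clears the threshold, else None."""
        best_score, best_response = -1.0, None
        for cached_embedding, response in self.entries:  # linear scan, fine for a sketch
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, embedding, response):
        self.entries.append((list(embedding), response))


def handle_prompt(cache, embedding, call_upstream):
    """Return (response, was_cache_hit). call_upstream is only invoked on a miss."""
    cached = cache.lookup(embedding)
    if cached is not None:
        return cached, True
    response = call_upstream()        # assumed to wrap the real LLM provider call
    cache.store(embedding, response)  # the post does this asynchronously
    return response, False
```

A real deployment would also need to key entries on the system prompt and embedding model version, since a similarity match alone says nothing about whether the cached answer was produced under the same instructions.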
My Questions for the Community:

1. For those running LLMs in production (especially customer support, internal RAG, or autonomous agents), what is your realistic semantic cache hit rate? Is the power law of repetitive queries high enough in your domains to justify this?
2. What are the biggest footguns with semantic caching at the edge? (e.g., cache invalidation strategies, handling system prompt updates, or drift in embedding models).
3. Would you actually use a drop-in open-source template/CLI that lets you spin this up on your own edge account, or do you prefer centralized API gateways?

submitted by /u/Real-Huckleberry-934
Has anyone built a GOOD map of European physical AI ventures? 🇪🇺 🦾(reddit.com)
I had a first go, putting together some of our friends in the space plus a bit of research. It’s inspiring to see this vertical grow while everyone complains Europe is dead in tech. You do not need to live in SF or Shenzhen to build with robots. You just need good engineers and a high tolerance for pain. There is a lot of heavy metal waiting to wake up in Europe.

Cyberwave Mirai Robotics Alto Robotics Fluid Wire Robotics Caracol AM ANYbotics Niulinx NEURA Robotics Generative Bionics Pipein Wearable Robotics Enchanted Tools Flybotix Quantum Systems Wandercraft Voliro Exotec Automata Agreenculture Reactive Robotics Verne EasyMile Inbolt

https://preview.redd.it/i1ivxo1r1t6h1.png?width=2220&format=png&auto=webp&s=b49920fd224ce5e23121d673f33a78aa8174cecd

Who’s missing? Feel free to tag your venture in the comments. Also: I’ll put the link to the database in the comments if anyone wants to contribute to the map, and then I’ll happily publish a v2 🫡

Rough visual made with Claude Code; can’t wait to see more logos on it.

submitted by /u/Erlapso
hubert.cpp, a C++ implementation of distilHuBERT [P](reddit.com)
I've written a C++ implementation of distilHuBERT: https://github.com/pfeatherstone/hubert.cpp

It has no runtime dependencies, the weights are compiled into the library, it supports dynamic sizes, has performance on par with onnxruntime (in my tests) and can be easily integrated into any CMake project. Please let me know your thoughts.

submitted by /u/Competitive_Act5981
5 ICML papers in 5 months [D](reddit.com)
“…5 papers at ICML (1 Spotlight)…”

“…Five ICML papers is what a strong PhD produces in four years. I did it in five months…”

I recently saw these posts from people at the same AI company. At first, I was extremely surprised. It turned out they were workshop papers. Am I missing something here, or are workshop papers now being treated as equivalent to main-track papers?

submitted by /u/Terrible-Chicken-426
Check out Multi-Objective Intelligent Industrial Robot Calibration Using Meta-Heuristic Optimization Approaches(reddit.com)
Hi everyone, I wanted to share our latest open-access paper published in the journal Robotics: Multi-Objective Intelligent Industrial Robot Calibration Using Meta-Heuristic Optimization Approaches.

The Problem

Traditional industrial robot calibration heavily focuses on a single goal: maximizing absolute end-effector position accuracy. However, purely optimizing for position errors often results in the algorithm recommending unrealistic, drastic shifts to the robot’s physical kinematic structure (its Denavit–Hartenberg parameters). This creates a stark deviation from the manufacturer's nominal specifications and can degrade performance across different areas of the workspace.

Our Approach

We framed this challenge as a multi-objective optimization problem to strike a balance between two competing goals:

- Position Accuracy: Minimizing discrepancies using joint angle readings and a high-precision laser tracker (LT).
- Kinematic Realism: Minimizing the mean absolute deviation of the calibrated DH parameters from the manufacturer's original design specs.

To find the optimal trade-off, we deployed and benchmarked several leading evolutionary and swarm optimization algorithms:

- NSGA (Nondominated Sorting Genetic Algorithms)
- MOEA/D (Multi-Objective Evolutionary Algorithm based on Decomposition)
- MOPSO (Multi-Objective Particle Swarm Optimization)

Key Takeaways

- Utilizing a multi-objective framework prevents overfitting to specific target points and keeps the structural kinematic parameters physically viable.
- Swarm and evolutionary approaches excel at generating an adaptable Pareto front, giving automation engineers finer control over calibration trade-offs.

The full methodology, mathematical formulations, and comparative results are available to read for free on the MDPI Robotics Publication Page.
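To make the Pareto-front idea concrete, here is a minimal sketch of non-dominated filtering over two minimized objectives. It is not the paper's method or code: the function names are hypothetical, and the candidate values (position error, mean absolute DH deviation) are made-up numbers chosen only to illustrate the trade-off.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Both objectives are minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up calibration candidates: (position_error, dh_deviation).
candidates = [
    (0.10, 0.90),  # very accurate, but far from the nominal DH parameters
    (0.20, 0.40),
    (0.50, 0.10),  # close to nominal, but less accurate
    (0.30, 0.50),  # dominated by (0.20, 0.40)
    (0.60, 0.60),  # dominated by (0.20, 0.40)
]

front = pareto_front(candidates)
```

Here `front` keeps the first three candidates. Each one represents a different compromise between accuracy and kinematic realism, which is the choice the post says is left to the automation engineer.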
I would love to hear the community's thoughts on using meta-heuristics for kinematic calibration, or answer any questions you might have about our experimental setup and algorithm performance!

submitted by /u/MAK42018
How do I generate /odom from BLDC hub motor hall sensors?(reddit.com)
I'm building an autonomous rover using ROS2. For mapping, I'm using SLAM Toolbox, and my goal is to navigate the rover autonomously. My rover uses BLDC hub motors (the type of wheel in the picture) that have built-in hall sensors. However, I'm confused about how to generate the /odom topic required by SLAM Toolbox using these hall sensors.

From what I understand, SLAM Toolbox needs odometry data, but I'm not sure:

- How to convert hall sensor readings into wheel odometry.
- How to calculate wheel position, velocity, and robot pose from the hall sensor data.
- Whether hall sensors alone are accurate enough for odometry.
- If there are any ROS2 packages or existing solutions that can help with this.

Has anyone implemented odometry using BLDC hub motor hall sensors in ROS2? Any examples, tutorials, or advice would be greatly appreciated.

submitted by /u/Organic-Author9297
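One common way to approach the first two questions above is dead reckoning from signed hall-transition counts. The sketch below assumes a two-wheel differential-drive rover, which the post does not confirm. It also assumes three hall sensors per motor (six state changes per electrical revolution, so ticks per wheel revolution is 6 times the pole-pair count) and that the motor driver already reports signed tick deltas. All names and the example geometry are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose2D:
    x: float = 0.0      # metres
    y: float = 0.0      # metres
    theta: float = 0.0  # radians, wrapped to [-pi, pi]


def ticks_to_distance(delta_ticks, ticks_per_rev, wheel_radius):
    """Distance rolled by one wheel for a signed change in hall ticks."""
    return 2.0 * math.pi * wheel_radius * delta_ticks / ticks_per_rev


def update_pose(pose, d_left, d_right, wheel_separation):
    """Differential-drive dead reckoning using midpoint integration."""
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / wheel_separation
    heading_mid = pose.theta + d_theta / 2.0
    new_theta = pose.theta + d_theta
    return Pose2D(
        x=pose.x + d_center * math.cos(heading_mid),
        y=pose.y + d_center * math.sin(heading_mid),
        theta=math.atan2(math.sin(new_theta), math.cos(new_theta)),
    )


# Example values only: 15 pole pairs -> 90 ticks/rev, 0.08 m wheel radius,
# 0.40 m between the wheels.
TICKS_PER_REV = 6 * 15
WHEEL_RADIUS = 0.08
WHEEL_SEPARATION = 0.40

pose = Pose2D()
d_l = ticks_to_distance(45, TICKS_PER_REV, WHEEL_RADIUS)
d_r = ticks_to_distance(45, TICKS_PER_REV, WHEEL_RADIUS)
pose = update_pose(pose, d_l, d_r, WHEEL_SEPARATION)
```

Wheel velocity follows by dividing each distance by the sampling interval. In ROS2 the resulting pose and velocities would typically be published as a nav_msgs/msg/Odometry message together with the odom to base_link transform. With only a few dozen ticks per revolution the resolution is coarse, so low-speed velocity estimates will be noisy and heading drift is to be expected.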
Swarm Robotics: a beginner-friendly lecture on coordination, decentralization, and collective behavior(reddit.com)
I made a chapter in my Advanced Robotics course about swarm robotics, focusing on the main ideas behind multi-robot coordination rather than treating it as just a buzzword. The video covers topics like:

- what makes a robot group a “swarm”
- decentralized vs. centralized coordination
- local rules and emergent global behavior
- examples inspired by ants, birds, and collective systems
- why scalability and robustness are important in swarm robotics

I’m sharing it as a learning resource for students or beginners who are trying to understand where swarm robotics fits inside robotics and multi-agent systems.

Video: https://www.youtube.com/watch?v=EXH3NpsKtUc

I also keep the related course materials and source code here, for anyone who prefers to learn by reading or experimenting with code: https://github.com/mohammadijoo/Control_and_Robotics_Tutorials

For people working in robotics/control: what topics do you think should be added to make a swarm robotics lecture more useful: communication models, formation control, task allocation, path planning, or real hardware examples?

submitted by /u/abolfazl1363