Channels
Is it normal to use different styles of figures (colours, backgrounds, grids, etc.) when writing a paper? Personally, I think it looks unprofessional. submitted by /u/Few-Annual-157 [link] [Kommentare]
hi everyone, I created a blog around how I started open source contribution, documented all minute details. Please give it a read and give review as this is my journey to do blogging for the first time. It is free! https://substack.com/home/post/p-200202050 submitted by /u/DqDPLC [link] [Kommentare]
Yes, I'm calling it out. It IS racism. As an active member of r/MachineLearning and a researcher who is ethnic Chinese, I am DISGUSTED by unfounded accusations against the group of researchers who constitute over half of the field. Such posts pop up every other week, grounded in conspiracy theories, and creating a sinophobia echo chamber. I understand the salty feeling when one's paper is rejected, no matter whether the paper actually deserves acceptance or not. Given the noise in conference organization and reviewing process, and a relatively junior body of participants, it is very likely that one finds a paper "worse than mine" slip into the conference, and there's a high chance that the paper has a Chinese author. That's simply because of the composition of the authors, and does not warrant accusations, aka witch hunts, towards certain ethnic groups. This sub is about an important scientific subject in the modern world. If anyone agrees with the logic "80% of the authors are Chinese, so my rejection is their fault.", they should seriously rethink their career plan since such thinking does not belong to serious scientists. We should be open to discussing the problems we have in the current conference organization and reviewing process, but racism should not have a foothold in our field. submitted by /u/AffectionateLife5693 [link] [Kommentare]
Hi r/MachineLearning, Wanted to share something I'm excited about. I’ve been fascinated by AlphaEvolve and its results for more than a year now, but using open source frameworks seems overwhelming because of the high costs. I can’t really afford hundreds of Claude Opus calls every time I want to run it. I want to be able to try it out many times and all sorts of unique domains. What if it was possible for AlphaEvolve to be much more affordable while getting a better performance? Over the last six months or so, I’ve been working on LEVI, an open source AlphaEvolve-like system that can outperform existing open source frameworks at a fraction of the cost (upto 35x cheaper!). It can also run on Claude Code or Codex, making it even more accessible (I've mostly been using it with a QWEN-30B). LEVI comes in two flavors where I felt it’ll make the most difference: Code Optimization, and Prompt Optimization (sorry math, you got a less direct path; workable through the code route). The core thesis behind LEVI is that with the right search architecture, smaller models can substitute for or outperform larger ones. This means it’s much more economical to rely on smaller models for most of the work. That’s the entire takeaway. Making this work in practice is a different problem, but if you forget everything else from this post this is the only message I think I’m really trying to convey here. LEVI does it in three ways: 1) Invest in solution diversity from the start and ensure its maintained. We don’t want to converge to the same solution, especially with smaller models in the mix, and rely on large models to pull us out of the basin. 2) Use smarter routing across larger and smaller models (i.e. most mutations don’t require a Claude Opus X) 3) For prompt optimization not every rollout is as important. Build a proxy subset to approximate. I’ve tried LEVI on systems problems (like MoE scheduling or database transaction scheduling) and found that LEVI outperforms existing frameworks on almost every problem I threw at it while consistently using a smaller budget (unto 7x cheaper). For prompt optimization, across problems like IFBench and HotSpotQA, LEVI reaches a similar or better score as GEPA while using less than half the rollouts! Happy to answer any questions or take any suggestions! If there are unexpected or niche domains where this can be applied, I would love to hear. Technical Blog: https://ttanv.github.io/levi/ GitHub: https://github.com/ttanv/levi submitted by /u/Longjumping-Music638 [link] [Kommentare]
I've been building agents for about a year and recently shipped one for a client running ~140 MCP-exposed tools at peak. Along the way I made the canonical mistake. I used cosine similarity over tool description embeddings to pick which tools the model could see per turn. Worked great in demos. Was actively dangerous in production. Here's the problem. In a basic semantic-ranking setup you embed the user query, embed every tool description once, and rank by cosine similarity at runtime. That works for general document retrieval where chunks are paragraph-length, semantically rich, and roughly equal in form. Tool descriptions are not that. They are short (often
hi,where can i find Maven AI Evals for Engineers & PMs and End-to-End AI Engineering Bootcamp videos.They are too costly.cant afford them.Can anybody help me in finding the resoursec for them? submitted by /u/Zestyclose_Block5381 [link] [Kommentare]
ArXiv has an endorsement system for a reason. I would only offer endorsement to whom I have direct academic collaboration or mentorship with, since I'm putting my own academic reputation on the stake. This is also the standard of almost any serious academic researcher I am aware of. Now ArXiv is making effort to crack down AI slop and banning accounts uploading low-quality research papers, which is a great initiative. By definition of an "endorsement", I wish ArXiv could backtrack and at least issue warnings to their endorsers, and if this happens multiple times (let's say three), people giving out careless endorsement should also face consequences. submitted by /u/AffectionateLife5693 [link] [Kommentare]
There are coordinated efforts where people have favoured and jeopardised the double blind review process. No doubt out of these 80% there are great talent but we have to acknowledge that non chinese have been sobotaged and this was also reflected in the recent leaks of the reviewer data from the top ml conferences (won’t name them but they start with i). I have also personally faced such discrimination and had a discussion on the subreddit asking others if they have witnessed something similar. It was shocking to know that this is occurring on large scale. The question is how do we stop it, or highlight this? We have to preserve the sanctity of the research. submitted by /u/AppropriatePush6262 [link] [Kommentare]
I run evaluations on generative image models as part of my workflow, mostly comparing coherence, prompt adherence, and compositional accuracy across different architectures. The consensus here seems to be that open models are still a generation behind closed APIs. Based on my recent benchmarks, that gap is way smaller than people assume. On compositional control specifically, the latest open checkpoints handle multi-object scenes with spatial relationships about as reliably as the paid endpoints I've tested. Not perfect, but close enough that the failure modes are comparable. The thing that surprised me was text rendering in images, which used to be a disaster on open models. Recent architectures actually get it right roughly 70-80% of the time on short strings. Generation speed is another misconception. People complain about inference time but I'm getting 2MP outputs in under two minutes on a single consumer GPU. Drop resolution and step count and you're at 30 seconds. Fine for iteration. The structured prompting argument also falls flat. Everyone acts like having explicit scene control is a downside when it's literally what production pipelines need. Unstructured text prompts are the hack, not the other way around. These models ship without community optimizations, no fine-tuning, no custom pipelines. The baseline is already competitive. submitted by /u/ProfessionalAnt7436 [link] [Kommentare]
With more software engineers entering into data science and AI, I feel it's equally important for a person with data and AI background to dive into software development to survive, thrive in industry. I Know it's a very broad question, so suggestions with broad subjects, topics are welcome , like I often wonder how DSA is relevant. I totally understand the needs of the skills are deeply coupled with domain, industry and specific problems but unfortunately the industry doesn't understand this, it judges you, rewards you based on what you already know or pretend rather than your ability to learn or adapt. submitted by /u/Dapper_Chance_2484 [link] [Kommentare]
If ICML conference paper is rejected and no one opts-in or opts-out to keep the reviews visible, will the reviews be visible to everyone? There was clear instruction that only papers with at-least 1 opt-in AND zero opt-out options will be visible. None of the authors selected any option, But it in my openreview profile, it shows visible to everyone. please clarify. submitted by /u/Curious-Monitor497 [link] [Kommentare]
Im an undergraduate studying CS at a state school in the US. I’m interested in researching a specific style of self supervised learning (JEPA) and want to eventually go to grad school to study further. I have experience working in a lab similar to this topic, and I’ve become fairly comfortable with the literature and have a basic understanding of what its going on, but right now km only doing applied research in a specific domain (physics). I hope to eventually go to grad school to study this. But right now my opportunities are kinda limited as my school’s CS department is pretty mid. I was wondering if y’all have any advice on how to approach things? I know i can perform research independently but its not ideal due to: 1. Limited compute, less resources compared to a proper lab 2. Lack of a supervisor/guidance on the nuances of the field My current lab would be supportive if i do try to do things, but pure ml research is not really their main thing. I’ve heard people do REUs or cold email profs. But Im not sure if i could find something that specifix in an reu (also am international). And the labs i have seen working in this are either private or quite prestigious so im not sure how far cold emailing would take me. Sorry for the long post. Tldr; want to do pure ml research but theres no existing lab/professor at my current school who does something similar, wondering if any other pathways exist Any advice would be appreciated thanks submitted by /u/QuickStar07 [link] [Kommentare]
Hi folks, Deciding between these two Mac options has been a challenge for me, so pls help. I know mac is not even necessary for this but just help me to decide between these two options. For the reference, Im a swe student and looking forward to go deep into ml and data science in the near future… EDIT: mac book pro m5 ( base chip) that I’m referring here. submitted by /u/Both-Hovercraft3161 [link] [Kommentare]
Hi everyone, I'm an undergraduate student and ML researcher at UC Berkeley. My colleagues and I are working on a project that hopes to fix some of the problems users face with Colab. What are the features you wish it had as an ML professional, researcher, or enthusiast? What're the biggest problems you've faced while using it? Some of the issues that everyone feels (including us) is environment management and kernel persistence. But we would love to hear more from the community. submitted by /u/myplstn [link] [Kommentare]
Hey everyone, hope this is okay to post here. My co-author and I are currently between institutional affiliations, which means we don't have the academic email arXiv needs for an endorsement. We're hoping to find someone in cs.CV willing to take a quick look at our paper and endorse it if it meets your bar. The project: Locate-SAM2 We built a training-free pipeline connecting NVIDIA's LocateAnything-3B to Meta's SAM 2.1 through a lightweight adapter. The question we wanted to answer was simple: in a modular text-to-mask pipeline where everything is frozen, does the choice of grounder actually matter for the final mask? A few specifics, since the details are what tell you we're not just generating noise: On RefCOCO val, our system reaches 0.772 mIoU versus 0.717 for Grounding DINO Base, using the same SAM 2.1 backend throughout. RefCOCO appears in LocateAnything's training data, so we frame this honestly as in-domain benchmarking, not zero-shot transfer. We're not pretending otherwise. The paper has controlled comparisons across RefCOCO/+/g, adapter ablations, a ground-truth box oracle, a failure taxonomy, and a nonsense-prompt probe showing the pipeline needs abstention logic. Code is on GitHub and the paper is close to submission-ready. What we're hoping for Mainly an endorsement: someone to read the draft and, if they think it holds up, endorse us on arXiv. We'd acknowledge it and that's the whole ask. If anyone wants to get more involved, we're open to expanding the experiments or pointing the paper at a specific venue, and we'd talk co-authorship based on real contribution. We also have separate work in progress in physically-constrained DL, geospatial AI, and AI governance, in case any of that overlaps with what you do. We're not looking for a blind voucher. Drop a comment or a DM and we'll share the PDF and the repo. Happy to answer questions, and thanks for reading. submitted by /u/j_root_ [link] [Kommentare]