Discovery

Looking for critical review of an NN architecture (possible evaluation bias?) [D](reddit.com)

c/artificial-intelligence · by @Body Automated · #ai #artificial-intelligence #software · 4 days

Hi everyone, I’m an amateur student who has been experimenting with neural networks, mostly out of curiosity. Over the past few weeks, I ended up going fairly deep into a specific architecture I designed, which I call a Directional Neural Network (DirNN). This isn’t meant as a polished or formal contribution; it’s something I’ve been tinkering with, iterating on, and testing in my spare time.

That said, the architecture does impose real structural constraints and uses a custom backward pass. In my own experiments on simple tasks (including some using GloVe embeddings), the DirNN has repeatedly performed better than standard MLP baselines. This result has been consistent enough that I don’t think it’s pure luck, but I’m very aware that I might be fooling myself.

What I’m unsure about is whether I’ve been unfair in my comparisons. I don’t know if:

- the DirNN is effectively a special or degenerate case of an MLP
- my training procedure, initialization, or optimizer choices favor it in subtle ways
- the tasks or datasets I’m using make the comparison misleading

I’ve put together a small repository with a README describing the architecture, the custom backward pass, and a minimal script to reproduce what I’m seeing. I’m posting here because I could really use a sanity check from people more experienced than me. If this is obviously flawed, I’d much rather learn that now. Blunt technical criticism, references, or “you’re missing X” comments are all very welcome.

Repository: DirNNs

Thanks for reading. I’m genuinely here to learn.

submitted by /u/jos_lucas73
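One common way to check the kind of comparison bias described above is to match parameter budgets and compare the two models seed by seed. The sketch below is only an illustration and is not taken from the DirNN repository: the helper names `mlp_param_count` and `paired_summary`, the layer sizes, and the accuracy lists are all hypothetical placeholders, not measured results.

```python
import statistics


def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected MLP.

    layer_sizes is e.g. [300, 128, 2] for input, hidden, and output widths
    (hypothetical sizes, chosen only for illustration).
    """
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]))


def paired_summary(acc_a, acc_b):
    """Mean and sample standard deviation of per-seed differences (a - b).

    Both lists must come from runs that share seeds, data splits, and
    training budget, otherwise the pairing is meaningless.
    """
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 seeds")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Placeholder numbers, NOT real results: one accuracy per shared seed.
candidate_acc = [0.5, 0.6, 0.7]
baseline_acc = [0.4, 0.6, 0.5]

baseline_params = mlp_param_count([300, 128, 2])
mean_diff, std_diff = paired_summary(candidate_acc, baseline_acc)
print(f"baseline parameters: {baseline_params}")
print(f"mean difference: {mean_diff:.3f} (std {std_diff:.3f})")
```

If the mean difference is small relative to its spread across seeds, or if the baseline has a noticeably smaller parameter count, the apparent advantage may come from the setup rather than the architecture.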

Sources for ML news? [D](reddit.com)

c/artificial-intelligence · by @Sam Automated · #ai #artificial-intelligence #software · 5 days

I need a break from social media and all the bots. Aside from arXiv, are there any sources that do a good job of aggregating the good stuff and filtering out all the junk?

submitted by /u/Tiny_Arugula_5648

Does it make sense to use alternative quantizations of QAT models? [D](reddit.com)

c/artificial-intelligence · by @MrStickman Automated · #ai #artificial-intelligence #software · 5 days

From TF's website: "Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models."

So is it designed to work with a very specific quantization method (for Gemma-4, presumably, Google's own)? Or would it make sense to use alternative quantization methods? According to the benchmarks unsloth released, its (alternative) quantizations of Gemma-4-QAT are closer to the QAT fine-tunes, but is that a good thing, or does it defeat the purpose of QAT?

submitted by /u/we_are_mammals
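To make the "emulates inference-time quantization" idea concrete, here is a toy NumPy sketch of a quantize-dequantize ("fake quantization") round trip. It is an assumption-laden illustration, not Gemma's, TensorFlow's, or unsloth's actual scheme: the symmetric per-tensor grid, the per-group variant, the bit width, and the names `fake_quantize` and `group_fake_quantize` are all hypothetical choices made for the example.

```python
import numpy as np


def fake_quantize(w, num_bits=4):
    """Symmetric per-tensor quantize-dequantize round trip (toy scheme)."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax)  # integer grid indices
    return q * scale  # back to floats, now lying on the grid


def group_fake_quantize(w, num_bits=4, group_size=4):
    """Same round trip, but with a separate scale for each group of weights."""
    groups = w.reshape(-1, group_size)
    out = np.stack([fake_quantize(g, num_bits) for g in groups])
    return out.reshape(w.shape)


rng = np.random.default_rng(0)
w = rng.normal(size=16)

# Stand-in for weights that have already been snapped to the per-tensor grid.
w_qat = fake_quantize(w)

# Re-applying the same scheme should change (almost) nothing.
err_same = float(np.max(np.abs(fake_quantize(w_qat) - w_qat)))
# A different scheme uses a different grid and may add new rounding error.
err_other = float(np.max(np.abs(group_fake_quantize(w_qat) - w_qat)))

print(f"same scheme error:  {err_same:.2e}")
print(f"other scheme error: {err_other:.2e}")
```

In this toy setting, weights adapted to one grid round-trip cleanly through that grid, while a scheme with different scales typically moves them again. Whether that matters for a real QAT checkpoint depends on how the model was trained and exported, which this sketch does not model.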

Training-free graph SSL matches GCN with 5× fewer labels — live demo [P](reddit.com)

c/artificial-intelligence · by @Nikobar Automated · #ai #artificial-intelligence #software · 5 days

Hi all, I have been working on this method for quite some time, starting from a hunch and with the help of several LLMs. At first I was engineering it myself while I was still learning supervised ML, but the hunch led me into semi-supervised ML, and quite deep into it. I then became an LLM orchestrator of sorts while four LLMs tried to figure it out.

I put up a live demo on Hugging Face Spaces where you can try it yourself: set the number of labels, click run, see the accuracy. No installation, no code required.

Brief about the method

Optimus: Graph SSL under Extreme Label Scarcity

Key Results (PathMNIST, N=2000, 9 classes)

Labels total        Optimus   GCN
9 (1 per class)     73.9      60.6
27 (3 per class)    77.3      68.5
45 (5 per class)    79.8      77.1

https://huggingface.co/spaces/Keshu007/optimus-graph-ssl

Edit: You can even run the code on your own dataset.

submitted by /u/Loner_Indian

Anyone here with experience submitting to Nature Machine Intelligence? [R](reddit.com)

c/artificial-intelligence · by @Didi Automated · #ai #artificial-intelligence #software · 5 days

I'm planning to submit a paper to NMI, but this will be my first paper at a Nature-like venue. Would love a quick chat with anyone who has experience. My paper is geared more towards signal processing with ML for a specific subfield of engineering, but it can be interdisciplinary.

submitted by /u/PlateLive8645
