InFeeo
Language

Introducing Papers Without Code [P](reddit.com)

×
Link preview Introducing Papers Without Code [P] Hi, Niels here from the open-source team at Hugging Face. I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on arXiv/Hugging Face, enabling leaderboards to be created. See BrowseComp below as an example (a scatter plot and a table are available for each benchmark). - Scatter plot (you can hover over the dots to see the models): https://preview.redd.it/9rz2r3ffcf6h1.png?width=2880&format=png&auto=webp&s=b3f8e7a870802f6ef8227ecc0619e9e1057554b0 - Table: https://preview.redd.it/qoqriddw5f6h1.png?width=2862&format=png&auto=webp&s=a0034574f693847537037013672fb61daf27b16e As you can see, I've added support for viewing evals for closed-source models, too, given that many benchmarks are nowadays dominated by them, like GPT-5.5 and Mythos 5. You can always disable viewing closed-source evals with a toggle or in your PwC settings: https://preview.redd.it/p3k6jt6q6f6h1.png?width=1582&format=png&auto=webp&s=40149e51d6b326a77e53e33baf70d9850b3de365 When you turn them off, here's what the open model leaderboard looks like: https://preview.redd.it/tg42sin36f6h1.png?width=2838&format=png&auto=webp&s=1330a117ae9b4e0ce6d459493ae9e8f64107310a Closed-source papers are treated as regular "papers", although they can be any source, like a blog post (given that PwC supports submitting any source beyond arXiv). See the GPT-5.5 or Mythos 5 papers as examples, with their evals at the bottom. Notice the "closed" tag on their evals. Hence, you could jokingly call these "papers without code". Let me know what you think of this, and whether anything needs to be changed or added! Kind regards, Niels submitted by /u/NielsRogge [link] [Kommentare] reddit.com · reddit.com
Hi, Niels here from the open-source team at Hugging Face. I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on arXiv/Hugging Face, enabling leaderboards to be created. See BrowseComp below as an example (a scatter plot and a table are available for each benchmark). - Scatter plot (you can hover over the dots to see the models): https://preview.redd.it/9rz2r3ffcf6h1.png?width=2880&format=png&auto=webp&s=b3f8e7a870802f6ef8227ecc0619e9e1057554b0 - Table: https://preview.redd.it/qoqriddw5f6h1.png?width=2862&format=png&auto=webp&s=a0034574f693847537037013672fb61daf27b16e As you can see, I've added support for viewing evals for closed-source models, too, given that many benchmarks are nowadays dominated by them, like GPT-5.5 and Mythos 5. You can always disable viewing closed-source evals with a toggle or in your PwC settings: https://preview.redd.it/p3k6jt6q6f6h1.png?width=1582&format=png&auto=webp&s=40149e51d6b326a77e53e33baf70d9850b3de365 When you turn them off, here's what the open model leaderboard looks like: https://preview.redd.it/tg42sin36f6h1.png?width=2838&format=png&auto=webp&s=1330a117ae9b4e0ce6d459493ae9e8f64107310a Closed-source papers are treated as regular "papers", although they can be any source, like a blog post (given that PwC supports submitting any source beyond arXiv). See the GPT-5.5 or Mythos 5 papers as examples, with their evals at the bottom. Notice the "closed" tag on their evals. Hence, you could jokingly call these "papers without code". Let me know what you think of this, and whether anything needs to be changed or added! Kind regards, Niels submitted by /u/NielsRogge [link] [Kommentare]

Comments

Log in Log in to comment.

No comments yet.