InFeeo

SWE Marathon (swe-marathon.org)

20 multi-hour SWE tasks spanning library reproductions, full-stack product clones, and ML engineering. 1,300 logged trials; frontier configurations resolve fewer than 19% of tasks.

Comments

No comments yet.