
Popping the GPU Bubble (github.com)

Link: GitHub - m87-labs/moondream: tiny vision language model (github.com)
Photon, Moondream's inference engine, achieves near-real-time VLM inference (~33 ms on an NVIDIA B200). This is a peek into how it delivers up to 35% higher decode throughput by optimizing how the GPU works.
