a tiny vision language model that kicks ass and runs anywhere
| Image | Example |
|---|---|
| ![]() | **What is the girl doing?**<br>The girl is sitting at a table and eating a large hamburger.<br><br>**What color is the girl's hair?**<br>The girl's hair is white. |
| | **What is this?**<br>This is a computer server rack, which is a device used to store and manage multiple computer servers. The rack is filled with various computer servers, each with their own dedicated space and power supply. The servers are connected to the rack via multiple cables, indicating that they are part of a larger system. The rack is placed on a carpeted floor, and there is a couch nearby, suggesting that the setup is in a living or entertainment area.<br><br>**What is behind the stand?**<br>Behind the stand, there is a brick wall. |
Moondream is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint. It's designed to be versatile and accessible, capable of running on a wide range of devices and platforms.
The project offers two model variants:
Moondream can be run locally or in the cloud. Please refer to the Getting Started page for details.
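As a rough illustration of local use, the sketch below asks several questions about one image, in the style of the examples in the table above. It rests on assumptions not stated on this page: that the weights are loaded through Hugging Face `transformers` from the `vikhyatk/moondream2` checkpoint, and that the loaded model exposes a `query(image, question)` method returning a dict with an `answer` key. `load_moondream` and `ask_all` are hypothetical helper names introduced here. The Getting Started page remains the authoritative reference.

```python
from typing import Any, Dict, List


def load_moondream(model_id: str = "vikhyatk/moondream2") -> Any:
    """Load the model via Hugging Face transformers (assumed checkpoint id).

    transformers is imported lazily so the rest of this sketch can be used
    without it installed.
    """
    from transformers import AutoModelForCausalLM

    # trust_remote_code is required because the model class ships with the checkpoint.
    return AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)


def ask_all(model: Any, image: Any, questions: List[str]) -> Dict[str, str]:
    """Ask each question about a single image and collect the answers.

    Assumes model.query(image, question) returns a dict with an "answer" key.
    """
    answers: Dict[str, str] = {}
    for question in questions:
        result = model.query(image, question)
        answers[question] = result["answer"].strip()
    return answers


# Usage (downloads the weights on first run, so it is left commented out):
# from PIL import Image
# model = load_moondream()
# image = Image.open("photo.jpg")  # hypothetical local file
# print(ask_all(model, image, ["What is the girl doing?"]))
```

Keeping the question loop separate from model loading means the same helper could be pointed at a cloud client instead, provided that client offers a compatible `query` method.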
```shell
$ claude mcp add moondream \
    -- python -m otcore.mcp_server <graph>
```