The fastest embedding database for building AI applications is now open source! - 武穆逸仙, July 2025

Name: chroma-core/chroma

URL: https://github.com/chroma-core/chroma

Forks: 335, Stars: 5.6k, Language: Python

Description: the open source embedding database

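Chroma is an embedding database that lets us build LLM-based AI applications as quickly as possible.

Embeddings are the AI-native way to represent any kind of data, which makes the data a better fit for all sorts of AI-powered tools and algorithms. They can represent text and images, as well as audio and video. For a more detailed introduction to embeddings, see the link below: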
https://zhuanlan.zhihu.com/p/164502624?utm_id=0
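
To make the idea a little more concrete: an embedding is commonly a fixed-length list of numbers, and items with similar meaning tend to get vectors that point in similar directions. The sketch below is plain Python and does not use Chroma. The three vectors are hand-made illustrative values (not the output of any real model), and cosine similarity is just one common way to compare embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, chosen by hand for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

# "cat" should be closer to "kitten" than to "car".
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```

An embedding database stores vectors like these together with the original data, so that "find similar items" becomes "find nearby vectors".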

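Technical features

Chroma is an embedding database, so it stores embeddings. It also stores the text itself and supports both text search and search over the embeddings.

Beyond storage, Chroma is fast, simple, and easy to develop with, because its core API has only four functions.

Installation and usage

Chroma runs in Python and JavaScript; other languages are not supported yet.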


Next, taking Python as the example, here is how to install and use it.

1. Installation

Chroma can be installed with pip:

pip install chromadb

2. Usage

Get a Chroma client.

import chromadb
chroma_client = chromadb.Client()

Create a collection, which is where embeddings, text, and any other metadata are stored.

collection = chroma_client.create_collection(name="my_collection")

Add some text to the collection you just created. Chroma processes the added text automatically, handling steps such as tokenization and indexing.

collection.add(
    documents=["This is a document", "This is another document"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2"],
)
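
Note that collection.add takes parallel lists: the i-th document, the i-th metadata dict, and the i-th id all describe the same record. When the data starts out as a list of records, a small helper along the following lines may help keep the lists aligned. This is only a sketch: build_add_args and its id scheme are hypothetical and not part of chromadb.

```python
def build_add_args(records, id_prefix="id"):
    """Turn [(text, source), ...] into keyword arguments for collection.add.

    Illustrative helper only; it is not a chromadb function.
    """
    documents, metadatas, ids = [], [], []
    for index, (text, source) in enumerate(records, start=1):
        documents.append(text)
        metadatas.append({"source": source})
        ids.append(f"{id_prefix}{index}")
    return {"documents": documents, "metadatas": metadatas, "ids": ids}


add_args = build_add_args([
    ("This is a document", "my_source"),
    ("This is another document", "my_source"),
])
print(add_args["ids"])  # ['id1', 'id2']

# With a real collection this would be passed on as:
# collection.add(**add_args)
```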

If you already have embeddings, you can add them to Chroma directly rather than having Chroma compute them from the text.

collection.add(
    embeddings=[[1.2, 2.3, 4.5], [6.7, 8.2, 9.2]],
    documents=["This is a document", "This is another document"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2"],
)
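
When you supply your own embeddings, each list needs one entry per id, and the vectors should all have the same length. The check below is a minimal sketch of that precondition in plain Python. validate_batch is a hypothetical helper, not a chromadb function, and whatever validation Chroma does on its side is not reproduced here.

```python
def validate_batch(ids, embeddings, documents=None, metadatas=None):
    """Return the shared embedding dimension, or raise ValueError."""
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique")
    for name, values in (
        ("embeddings", embeddings),
        ("documents", documents),
        ("metadatas", metadatas),
    ):
        if values is not None and len(values) != len(ids):
            raise ValueError(
                f"{name} has {len(values)} entries, expected {len(ids)}"
            )
    dimensions = {len(vector) for vector in embeddings}
    if len(dimensions) > 1:
        raise ValueError(f"mixed embedding dimensions: {sorted(dimensions)}")
    return dimensions.pop() if dimensions else 0


dimension = validate_batch(
    ids=["id1", "id2"],
    embeddings=[[1.2, 2.3, 4.5], [6.7, 8.2, 9.2]],
    documents=["This is a document", "This is another document"],
)
print(dimension)  # 3
```

Running a check like this before calling collection.add can surface a mismatched list early, with an error message you control.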

To query, pass a list of query texts. n_results sets how many of the most similar results are returned for each query text, for example 2:

results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
)
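
Conceptually, a query asks for the n_results stored items whose embeddings are closest to the query's embedding. The following brute-force sketch shows that idea in plain Python. It is not how Chroma is implemented (the article notes that Chroma builds an index); the nearest function, the extra "id3" vector, and the choice of squared Euclidean distance are assumptions made for illustration.

```python
def nearest(query, stored, n_results=2):
    """Return the ids of the n_results stored vectors closest to query.

    stored maps id -> embedding; distance is squared Euclidean.
    """
    def squared_distance(vector):
        return sum((q - v) ** 2 for q, v in zip(query, vector))

    ranked = sorted(stored, key=lambda item_id: squared_distance(stored[item_id]))
    return ranked[:n_results]


stored = {
    "id1": [1.2, 2.3, 4.5],
    "id2": [6.7, 8.2, 9.2],
    "id3": [1.0, 2.0, 4.0],  # hypothetical extra record
}
print(nearest([1.1, 2.2, 4.4], stored, n_results=2))  # ['id1', 'id3']
```

A full scan like this costs time proportional to the number of stored vectors per query, which is one reason an embedding database maintains an index instead.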

For more operations, see the official usage guide: https://docs.trychroma.com/usage-guide

Project GitHub URL: https://github.com/chroma-core/chroma
