
The multi-model database optimised for knowledge search and generative AI

Get started!
Getting started with NucliaDB is easy. You can install it locally using Docker or pip. Once it’s up and running, install the nucliadb-dataset and nucliadb-sdk libraries to start working with it.
1. Install NucliaDB and run it locally
pip install nucliadb
nucliadb
2. Create your first KnowledgeBox
A KnowledgeBox is a data container in NucliaDB. You can create one with just a few lines of code and start filling it with data.
from nucliadb_sdk.utils import create_knowledge_box
my_kb = create_knowledge_box("my_new_kb")
3. Upload data
from nucliadb_sdk import Entity, File
from nucliadb_sdk.knowledgebox import KnowledgeBox
from sentence_transformers import SentenceTransformer
# Encoder used to compute the vector stored alongside the resource
encoder = SentenceTransformer("all-MiniLM-L6-v2")
resource_id = my_kb.upload(
    key="mykey1",
    binary=File(data=b"asd", filename="data"),
    text="I'm Sierra, a very happy dog",
    labels=["emotion/positive"],
    entities=[Entity(type="NAME", value="Sierra", positions=[(4, 9)])],
    vectors={"all-MiniLM-L6-v2": encoder.encode(["I'm Sierra, a very happy dog"])[0]},
)
# The same resource is reachable by its id or by its key
my_kb[resource_id] == my_kb["mykey1"]
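The upload call passes positions=[(4, 9)] for the entity "Sierra". The text above does not say whether the end offset is inclusive or exclusive, so the following is only a sketch of how such offsets can be derived from the text; find_entity_span is a hypothetical helper, not part of nucliadb-sdk. In this sentence "Sierra" starts at index 4 and its last character sits at index 9, which matches the inclusive form.

```python
def find_entity_span(text: str, value: str, inclusive_end: bool = False) -> tuple[int, int]:
    """Return (start, end) character offsets of the first occurrence of value."""
    start = text.find(value)
    if start == -1:
        raise ValueError(f"{value!r} does not occur in the text")
    end = start + len(value)  # half-open end, as used by Python slicing
    return (start, end - 1) if inclusive_end else (start, end)


text = "I'm Sierra, a very happy dog"
half_open = find_entity_span(text, "Sierra")
inclusive = find_entity_span(text, "Sierra", inclusive_end=True)

print(half_open, repr(text[half_open[0]:half_open[1]]))
print(inclusive)
```

Check the SDK documentation for the convention it expects before relying on either form.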
4. Search
4.1. Semantic search
from nucliadb_sdk.knowledgebox import KnowledgeBox
from sentence_transformers import SentenceTransformer
encoder = SentenceTransformer("all-MiniLM-L6-v2")
query_vectors = encoder.encode(["To be in love"])[0]
results = my_kb.search(vector=query_vectors, vectorset="all-MiniLM-L6-v2", min_score=0.25)
Iterate over the results:
for result in results:
    print(f"Text: {result.text}")
    print(f"Labels: {result.labels}")
    print(f"Score: {result.score}")
    print(f"Key: {result.key}")
    print(f"Score Type: {result.score_type}")
    print("------")
The results:
Text: love is tough
Labels: ['negative']
Score: 0.4688602387905121
Key: a027ee34f3a7489d9a264b9f3d08d3a5
Score Type: COSINE
------
Text: he is heartbroken
Labels: ['negative']
Score: 0.27540814876556396
Key: 25bc7b22b4fb4f64848a1b7394fb69b1
Score Type: COSINE
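The COSINE score type and the min_score threshold can be illustrated without a running database. The sketch below is plain Python over toy three-dimensional vectors, not NucliaDB's implementation: rank_by_cosine and the sample vectors are hypothetical, and treating min_score as an inclusive lower bound is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_cosine(query: list[float], vectors: dict[str, list[float]], min_score: float):
    """Score every stored vector, drop those below min_score, sort best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    kept = [(key, score) for key, score in scored if score >= min_score]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy vectors standing in for sentence embeddings (assumption, for illustration only)
toy_vectors = {
    "close": [1.0, 0.1, 0.0],
    "related": [0.5, 1.0, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}
ranked = rank_by_cosine([1.0, 0.0, 0.0], toy_vectors, min_score=0.25)
for key, score in ranked:
    print(f"Key: {key}  Score: {score:.3f}")
```

Raising min_score trims weaker matches from the tail of the list, which is why only two resources appear in the results above.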
4.2. Full text search
from nucliadb_sdk.knowledgebox import KnowledgeBox
results = my_kb.search(
    text="dog"
)
Iterate over the results:
for result in results:
    print(f"Resource key: {result.key}")
    print(f"Text: {result.text}")
    print(f"Score type: {result.score_type}")
    print(f"Score: {result.score}")
    print(f"Labels: {result.labels}")
The results:
Resource key: 4f1f570398c543e0b8c3b86e87ee2fbd
Text: Dog in catalan is gos
Score type: BM25
Score: 0.8871671557426453
Labels: ['neutral']
Resource key: 665e85f0fb2e4b2fbde8b4957b7462c1
Text: I'm Sierra, a very happy dog
Score type: BM25
Score: 0.7739118337631226
Labels: ['positive']
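The BM25 score type rewards documents that contain the query terms and penalises longer documents. The sketch below is a textbook Okapi BM25 in plain Python, given only to show why a shorter matching text can outrank a longer one. The tokenizer, the parameters k1=1.2 and b=0.75, and the third sample document are assumptions; the scores it prints are not expected to match the ones NucliaDB reports.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter, digit or apostrophe."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Return one Okapi BM25 score per document, in input order."""
    tokenized = [tokenize(doc) for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            # Number of documents that contain the term at least once
            containing = sum(1 for other in tokenized if term in other)
            idf = math.log((len(docs) - containing + 0.5) / (containing + 0.5) + 1.0)
            tf = counts[term]
            # Length normalisation: longer documents get a larger denominator
            norm = tf + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = ["Dog in catalan is gos", "I'm Sierra, a very happy dog", "love is tough"]
scores = bm25_scores("dog", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.4f}  {doc}")
```

Both matching texts contain "dog" once, so the shorter one ranks first, and the text without the term scores zero.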
4.3. Search by label
results = my_kb.search(
    filter=["emotion/positive"]
)
Iterate over the results:
for result in results:
    print(f"Resource key: {result.key}")
    print(f"Text: {result.text}")
    print(f"Labels: {result.labels}")
The results:
Resource key: f1de1c1e3fac43aaa53dcdc54ffd07fc
Text: I'm Sierra, a very happy dog
Labels: ['positive']
Resource key: b445359d434b47dfb6a37ca45c14c2b3
Text: what a delighful day
Labels: ['positive']
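The filter is written as "emotion/positive" while the results list the label as 'positive', which suggests a "labelset/label" naming scheme. That reading is an inference from the example, not something the text states. The sketch below filters an in-memory dictionary under that assumption; split_label, filter_by_labels and the sample resources are hypothetical, and requiring every requested label to match is also an assumption.

```python
def split_label(label: str) -> tuple[str, str]:
    """Split 'labelset/label' into its two parts; a bare label gets an empty labelset."""
    labelset, sep, name = label.partition("/")
    if not sep:
        return "", label
    return labelset, name


def filter_by_labels(resources: dict[str, list[str]], wanted: list[str]) -> dict[str, list[str]]:
    """Keep the resources that carry every label in wanted."""
    wanted_set = set(wanted)
    return {key: labels for key, labels in resources.items() if wanted_set <= set(labels)}


# Toy resources keyed by resource key (assumption, for illustration only)
resources = {
    "r1": ["emotion/positive"],
    "r2": ["emotion/negative"],
    "r3": ["emotion/positive", "topic/pets"],
}
matches = filter_by_labels(resources, ["emotion/positive"])
for key, labels in matches.items():
    print(f"Resource key: {key}")
    print(f"Labels: {[split_label(label)[1] for label in labels]}")
```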
Main features

It’s a cloud-native database
Install NucliaDB on multiple cloud storage providers, such as
Amazon S3, Google Cloud Storage, Azure File Storage, or
Alibaba file cloud storage.

Ultra-high read performance
NucliaDB offers ultra-high read performance to serve queries at scale.

Multi-model indexing
One database, multiple indexes.
- Vector indexing
- Paragraph indexing
- Full text indexing
- Relation indexing
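The idea of one database feeding several indexes can be sketched with a toy in-memory store in which a single upload updates four structures at once. ToyMultiIndexStore and all its attributes are hypothetical, and what each index holds here is a deliberate simplification; it says nothing about how NucliaDB builds its indexes internally.

```python
from collections import defaultdict


class ToyMultiIndexStore:
    """One upload call keeps four simple indexes in sync."""

    def __init__(self) -> None:
        self.vectors: dict[str, list[float]] = {}    # vector index: key -> vector
        self.paragraphs: dict[str, list[str]] = {}   # paragraph index: key -> paragraphs
        self.full_text = defaultdict(set)            # full text index: token -> keys
        self.relations = defaultdict(set)            # relation index: entity -> keys

    def upload(self, key: str, text: str, entities=(), vector=None) -> None:
        # Paragraphs are separated by blank lines in this sketch
        self.paragraphs[key] = [p.strip() for p in text.split("\n\n") if p.strip()]
        for token in text.lower().split():
            self.full_text[token.strip(".,!?")].add(key)
        for entity in entities:
            self.relations[entity].add(key)
        if vector is not None:
            self.vectors[key] = list(vector)


store = ToyMultiIndexStore()
store.upload("mykey1", "I'm Sierra, a very happy dog", entities=["Sierra"], vector=[0.1, 0.2])

print(store.full_text["dog"])      # keys whose text contains the token
print(store.relations["Sierra"])   # keys that mention the entity
```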

Open Source
NucliaDB is an open-source project that welcomes contributions from external developers.