How to build a search engine with Next.js and Nuclia

Next.js is a React framework for building server-side rendered applications. It is a great tool for building static websites, and it also supports dynamic websites through server-side rendering and static generation.

What are your options if you want to offer a search feature in your application?

With Next.js and NodeJS, you can definitely implement a very naive search engine able to find a word in any page of your website. But if you expect anything smarter than exact, case-insensitive word matching, it will be tough. Now imagine that your website contains videos, PDFs or audio files and you want their content to be searchable as well… That’s where Nuclia comes in!
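To make the limits of the naive approach concrete, here is a minimal sketch of such a search in NodeJS. The `pages` array and the `naiveSearch` function are hypothetical names chosen for illustration; they come from neither Next.js nor Nuclia.

```javascript
// Hypothetical in-memory index: a real site would build it from its page files.
const pages = [
  { slug: "intro", text: "Welcome to my blog about Cooking and travel." },
  { slug: "pasta", text: "How to cook fresh pasta at home." },
];

// Case-insensitive whole-word matching, and nothing smarter than that.
function naiveSearch(pages, word) {
  const needle = word.toLowerCase();
  return pages
    .filter((page) => page.text.toLowerCase().split(/\W+/).includes(needle))
    .map((page) => page.slug);
}

console.log(naiveSearch(pages, "cooking")); // finds the "intro" page only
console.log(naiveSearch(pages, "cook")); // finds the "pasta" page only
```

A query for "cook" misses the page that mentions "Cooking", and a query for "recipes" finds nothing at all, even though both pages are about that topic.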

Nuclia is an API able to index and process any kind of data, including audio and video files, to boost applications with powerful search capability. Nuclia uses natural language processing and machine learning to understand the searcher’s intent and return results that are more relevant to the searcher’s needs.

Using the Nuclia widget: as simple as copy/paste

To add the Nuclia search feature to your Next.js application, you need to create a Nuclia account. You can do it here.

Nuclia manages content in knowledge boxes. When you create your account, Nuclia automatically creates a default knowledge box for you.

After completing the account creation, you will be redirected to the Nuclia dashboard where you can manage your knowledge box. As you want to allow visitors to run searches on your website, you must make your knowledge box public. To do so, click on the Publish button on the top right of the page.

If you go to the “Widgets” entry in the left menu, you can create a new search widget for your Next.js application.
Let’s call it nextjs-search-widget.

Change the mode from input to form, and save it.

It generates a code snippet similar to:

<script src=""></script>

You just need to copy/paste it in your Next.js application to get your search feature up and running.

Indexing page contents automatically

As your knowledge box is empty, this is for now a very disappointing feature 🙂

Of course, you can use the Nuclia Dashboard to index files or web pages by yourself. That’s nice for testing purposes, but it would be much better if you could index your Next.js pages automatically.

Let’s write a NodeJS script that collects all the Markdown files in your Next.js application and indexes them in your knowledge box.

First you need to install the Nuclia SDK and the dependencies required to use it in NodeJS:

npm install @nuclia/core localstorage-polyfill isomorphic-unfetch
# OR
yarn add @nuclia/core localstorage-polyfill isomorphic-unfetch

Here is how a typical NodeJS script can use the Nuclia API:

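const { Nuclia } = require("@nuclia/core");
// Polyfills the SDK relies on when it runs outside a browser
require("localstorage-polyfill");
require("isomorphic-unfetch");

const nuclia = new Nuclia({
  backend: "",
  zone: "europe-1",
  knowledgeBox: "<YOUR-KB-ID>",
  apiKey: "<YOUR-API-KEY>",
});

// code to push data to Nuclia (detailed later)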

As you can see, you need to provide a Nuclia API key. An API key is necessary when adding or modifying contents in a knowledge box. You can get your API key in the Nuclia Dashboard, in the “API keys” section:

  • Create a new Service Access (name it nodejs-upload for example) with Contributor role
  • Click on the + sign to generate a new token for this service access
  • Copy the generated token and paste it in your NodeJS script
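Rather than pasting the token into the file, one option is to read it from the environment so that it never ends up in version control. The sketch below assumes two hypothetical environment variables, `NUCLIA_KB_ID` and `NUCLIA_API_KEY`, and a hypothetical helper named `getNucliaOptions`; none of them are defined by the Nuclia SDK.

```javascript
// Builds the options object for the Nuclia client from environment variables.
// The variable names are an assumption made for this example.
function getNucliaOptions(env = process.env) {
  const { NUCLIA_KB_ID, NUCLIA_API_KEY } = env;
  if (!NUCLIA_KB_ID || !NUCLIA_API_KEY) {
    throw new Error("Set NUCLIA_KB_ID and NUCLIA_API_KEY before running the script");
  }
  return {
    backend: "", // same value as in the snippet above
    zone: "europe-1",
    knowledgeBox: NUCLIA_KB_ID,
    apiKey: NUCLIA_API_KEY,
  };
}
```

With this in place, the script can create its client with `new Nuclia(getNucliaOptions())`, and the token stays out of the source file.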

Then you can write a script named upload-posts.js that will index all the Markdown files from ./pages/posts:

const fs = require("fs");
const path = require("path");
const { Nuclia } = require("@nuclia/core");
// Polyfills the SDK relies on when it runs outside a browser
require("localstorage-polyfill");
require("isomorphic-unfetch");

const nuclia = new Nuclia({
  backend: "",
  zone: "europe-1",
  knowledgeBox: "<YOUR-KB-ID>",
  apiKey: "<YOUR-API-KEY>",
});

const uploadPosts = (kb) => {
  // Get posts
  const postsDir = path.join(process.cwd(), "pages", "posts");
  const posts = fs.readdirSync(postsDir);

  posts
    .filter((post) => post.endsWith(".mdx"))
    .forEach((post) => {
      const postPath = path.join(postsDir, post);
      const postContent = fs.readFileSync(postPath, "utf8");
      // The title is expected on the first line, written as "# Title"
      const postTitle = postContent.split("\n")[0].replace("# ", "");
      const postSlug = post.replace(".mdx", "");

      // Upload post to Nuclia
      const resource = {
        title: postTitle,
        slug: postSlug,
        texts: {
          text: {
            format: "MARKDOWN",
            body: postContent,
          },
        },
      };
      kb.createResource(resource).subscribe({
        next: () => console.log(`Uploaded ${postSlug} to Nuclia`),
        error: (err) => console.error(`Error with ${postSlug}`, err),
      });
    });
};

nuclia.db.getKnowledgeBox().subscribe((kb) => uploadPosts(kb));

This script does the following:

  • iterates over the .mdx files in ./pages/posts,
  • for each file, reads its markdown content and gets its title from its first line,
  • and then uploads it to Nuclia using the createResource method.
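Taking the title from the first line only works if every post starts with its `#` heading. As a hedged variation, the sketch below looks for the first level-1 heading anywhere in the file and falls back to the slug when there is none. `extractPostMeta` is a hypothetical helper, not part of the Nuclia SDK.

```javascript
// Derives the slug from the file name and the title from the first "# ..." line.
function extractPostMeta(fileName, content) {
  const slug = fileName.replace(/\.mdx$/, "");
  // Level-1 headings only: "## Subtitle" does not match because "#" must be followed by whitespace.
  const heading = content.split("\n").find((line) => /^#\s+\S/.test(line));
  const title = heading ? heading.replace(/^#\s+/, "").trim() : slug;
  return { slug, title };
}

// A post that starts with front matter instead of its heading.
const sample = "---\ndate: 2023-01-01\n---\n\n# Hello Nuclia\n\nSome text.";
console.log(extractPostMeta("hello-nuclia.mdx", sample));
```

Note that this simple check would also pick up a line starting with `# ` inside a code sample, so it remains a sketch rather than a full Markdown parser.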

You can run this script with:

node upload-posts.js

Now if you check your Nuclia Dashboard, you should see your posts uploaded in your knowledge box!
If they are marked with a yellow dot in the resource list, it means that they are still being processed. When the processing is fully done, the dot will turn green and the corresponding resource can be searched from the search widget.

Indexing external links and media files

Nuclia can index any kind of data, not just text. Let’s say you have some posts containing links to YouTube videos or to local PDF files.

It would be nice to make their content searchable too.

So how about finding every link to a local file or an external web page in the blog posts and indexing them?

First you need to find links in markdown files. They are always written like [some-title](some-url). So you can use a regular expression to extract them:

const markdownLinks = [...postContent.matchAll(/\[.*?\]\((.*?)\)/g)].map(
  (match) => match[1]
);

Then there are two cases:

  • The link starts with http: it is an external link, so you will add it to the resource as a link field.
  • The link starts with /medias: it is a media file, so you will add it to the resource as a file field.
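The extraction and the two cases can be sketched together as a small function. `classifyLinks` is a hypothetical helper written for this example; the local prefix defaults to "/medias", as in the upload code later in this section, and can be overridden if your files live elsewhere.

```javascript
// Extracts every [title](url) target and sorts it into one of the two cases.
function classifyLinks(markdown, localPrefix = "/medias") {
  const urls = [...markdown.matchAll(/\[.*?\]\((.*?)\)/g)].map((match) => match[1]);
  return {
    external: urls.filter((url) => url.startsWith("http")),
    local: urls.filter((url) => url.startsWith(localPrefix)),
  };
}

const sample = [
  "Watch [the demo](https://www.youtube.com/watch?v=abc123).",
  "Read [the slides](/medias/slides.pdf) or go [home](/).",
].join("\n");

console.log(classifyLinks(sample));
```

Links that match neither case, such as the internal link to the home page above, are simply ignored.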

link fields can be added directly in the creation payload just like you did with the text field in the previous step:

const links = markdownLinks
  .filter((link) => link.startsWith("http"))
  .reduce((all, link, index) => {
    all[`link-${index}`] = { uri: link };
    return all;
  }, {});
const resource = {
  title: postTitle,
  slug: postSlug,
  texts: {
    text: {
      format: "MARKDOWN",
      body: postContent,
    },
  },
  // One link field per external URL found in the post
  links,
};

Note: the nice thing about link fields is Nuclia will automatically choose the right thing to index: if it is a regular web page, it indexes its text content, but if it is a YouTube video, it will index the video itself, not the YouTube page.

By contrast, file fields cannot be added directly because they are binaries, so you need to get the resource once it is created and then use its upload() method.

As it involves asynchronous operations, you need to install rxjs:

npm install rxjs
# OR
yarn add rxjs

Then you can write the following code:

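const { forkJoin, of } = require("rxjs");
const { switchMap } = require("rxjs/operators");
const localFiles = markdownLinks.filter((link) => link.startsWith("/medias"));
kb.createResource(resource, true).pipe(
  switchMap((data) =>
    localFiles.length > 0
      ? kb.getResource(data.uuid, [], []).pipe(
          switchMap((resource) =>
            forkJoin(localFiles.map((file) => {
              const filePath = path.join(process.cwd(), "public", file);
              const buf = fs.readFileSync(filePath);
              // Copy only this file's bytes: a small Buffer may sit inside a larger shared ArrayBuffer
              const fileContent = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
              const fileName = file.split("/").pop();
              return resource.upload(fileName, fileContent);
            }))
          )
        )
      : of(true)
  )
).subscribe({
  next: () => console.log(`Uploaded ${postSlug} to Nuclia`),
  error: (err) => console.error(`Error with ${postSlug}`, err),
});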

And now, by running the script again, you should see your posts with their links and media files indexed in Nuclia.

That’s it! You can now search in your blog posts from the search widget!

Nuclia widget

Implementing a custom search component

OK, so far you have seen how to use the Nuclia SDK to index data in Nuclia, and you have used the Nuclia search widget to provide a search experience to your users.

But what if this widget is not the perfect fit for your app?

You can definitely implement your own search component with React and use the Nuclia SDK to query the API.

Let’s create a new component in ./components/Search.js containing a minimal search input and a list of results (see full code example at the end of the article).

You can access your knowledge box with:

const kb = new Nuclia({
  backend: "",
  zone: "europe-1",
  knowledgeBox: "<YOUR-KB-ID>",
}).knowledgeBox;

Note that you do not need an API key here because you are not going to create or update any resource (and actually you should never put a contributor key in your client code).

Then you can use the search() method to query the API from the onChange handler of your input field:

const [query, setQuery] = useState("");
const [results, setResults] = useState([]);

const onChange = useCallback((e) => {
  const query = e.target.value;
  setQuery(query);
  kb.search(query).subscribe((results) =>
    setResults(
      results?.sentences?.results.map((result) => ({
        text: result.text,
        title: results.resources[result.rid]?.title || "No title",
      })) || []
    )
  );
}, []);

Note that the search() method returns an Observable so you need to subscribe to it to get the results. But Nuclia SDK also provides Promises via the asyncKnowledgeBox wrapper if you prefer.

Now you can display the results in your component:

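<div className={styles.container}>
  {results.map((result, index) => (
    // Inner markup is a minimal assumption: the original snippet is truncated here
    <div key={`result-${index}`}>{result.title}: {result.text}</div>
  ))}
</div>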

Custom search form

Enriching the search results

In the previous step, you have just displayed the sentences returned by the Nuclia API. These sentences are the ones that semantically match your query (“semantically” means the Nuclia search engine picked them not because they contain the query words but because their meaning is close to the meaning of your query).

But Nuclia returns much more information about the resources, which can be useful to display in your search results. For example, it returns paragraphs (the fuzzy search results, which match your query words even when they are misspelled or appear in a derived form); named entities (a.k.a. NER), which are the important concepts mentioned in the resources (people, dates, places, organizations, etc.); relations between resources (based on their entities); thumbnails; and many other interesting metadata. All of that is automatically extracted from your content as soon as it is pushed to the Nuclia API.

Let’s play a bit with the entities.

Change the onChange handler as follows:

// ReadableResource is imported from "@nuclia/core"
const onChange = useCallback((e) => {
  const query = e.target.value;
  setQuery(query);
  kb.search(query, [], {
    show: ["basic", "extracted"],
    extracted: ["text", "metadata"],
  }).subscribe((results) => {
    const sentences = results?.sentences?.results.map((result) => ({
      text: result.text,
      rid: result.rid,
    }));
    const resultsByRID = (sentences || []).reduce((acc, result) => {
      if (!acc[result.rid]) {
        const resource = new ReadableResource(results?.resources[result.rid]);
        const ner = resource.getNamedEntities();
        acc[result.rid] = {
          title: resource.title,
          sentences: [result.text],
          ner,
        };
      } else {
        // The resource is already listed: just add the new sentence to it
        acc[result.rid].sentences.push(result.text);
      }
      return acc;
    }, {});
    setResults(Object.values(resultsByRID));
  });
}, []);

What’s new here?

  • First, you call the search() method with options so it retrieves metadata (because NERs are part of these metadata).
  • Then, you iterate over the sentences and group them by resource ID (because a resource can have multiple sentences matching your query).
  • For each resource, you create a ReadableResource instance (which is a wrapper around the raw resource data returned by the API) and you call the getNamedEntities() method to get the NERs of the resource.
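The grouping step can be tried in isolation, without calling the API. The sketch below uses plain objects shaped like the results in the snippets above (a list of sentences carrying a rid, and a map of resources keyed by rid); `groupByResource` and the sample data are hypothetical and only illustrate the reduce pattern.

```javascript
// Groups matching sentences under the resource they belong to.
function groupByResource(sentences, resources) {
  const grouped = sentences.reduce((acc, { rid, text }) => {
    if (!acc[rid]) {
      acc[rid] = { title: resources[rid]?.title || "No title", sentences: [] };
    }
    acc[rid].sentences.push(text);
    return acc;
  }, {});
  return Object.values(grouped);
}

// Made-up data: two sentences from the same resource, one from an unknown resource.
const fakeResults = {
  sentences: {
    results: [
      { rid: "r1", text: "First match" },
      { rid: "r2", text: "Other match" },
      { rid: "r1", text: "Second match" },
    ],
  },
  resources: { r1: { title: "Post one" } },
};

console.log(groupByResource(fakeResults.sentences.results, fakeResults.resources));
```

Each entry of the returned array can then be rendered as one search result with its title and the list of sentences that matched.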

Now you are able to display the NERs in your results!

Search results with NER


In this article, you have seen how to integrate Nuclia in a Next.js site just by copy/pasting the search widget snippet, how to use the Nuclia SDK to index markdown contents and their related links and media files in Nuclia, and how to implement a custom search component with React.

There are plenty of other exciting things you can do with Nuclia, so don’t hesitate to check the Nuclia documentation if you want to know more!

The full code example discussed here is available on GitHub.
