👤 华总 📅 2025-08-24 📂 vitepress 📖 1362 ⏱ 1m

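Configuring Algolia

This article walks through the complete process of configuring the Algolia search service: registering an account and creating an application, creating an index, verifying the domain, configuring the crawler to extract the site's body text and code blocks, adjusting the index settings to enable advanced search features, and finally collecting the values needed for the site configuration. Each step is illustrated with screenshots, from initial setup through to integration.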

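1. Register an account

First, sign up for an account on the Algolia website. You can sign in directly with GitHub or register with an email address.

A new account comes with an Application created automatically; you can also create a new one yourself.

After confirming, click the Create Application button on the next page, then click NEXT to finish creating the Application. Once it has been created, click Skip for now. Otherwise Algolia generates an index name from your domain, and the crawler configuration it produces will not match the one used here.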

2. Create an index

Create the index by following the steps shown in the screenshot below.

3. Verify the domain

Verify your domain as prompted.

Then click Skip for now to move on to the next step.

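4. Configure the crawler

Create a new crawler as shown in the screenshot below.

After creating it, click the crawler's name to open its configuration.

The crawler configuration is shown below. First look at the automatically generated configuration and copy its appId and apiKey into the configuration below. Then change the domain and indexName to your own values and paste the result into the Algolia code editor. When the configuration is done, click Start Crawling in the upper-right corner to begin crawling the site.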

new Crawler({
  appId: "",
  apiKey: "",
  indexPrefix: "",
  rateLimit: 8,
  startUrls: ["https://xiaoying.org.cn/"],
  renderJavaScript: false,
  sitemaps: ["https://xiaoying.org.cn/sitemap.xml"],
  exclusionPatterns: [],
  ignoreCanonicalTo: true,
  discoveryPatterns: ["https://xiaoying.org.cn/**"],
  schedule: "on the first day of the week",
  actions: [
    {
      indexName: "hugos",
      pathsToMatch: ["https://xiaoying.org.cn/**"],
      recordExtractor: ({ url, $, helpers }) => {
        // 1. Extract the title and base record (used by splitContentIntoRecords)
        const baseRecord = {
          url,
          title: $("head title").text().trim(),
        };

        const $bodyClone = $("body").clone();
        // Remove code blocks so the code inside them is not parsed as body text
        $bodyClone.find("pre, code").remove();

        const splitRecords = helpers.splitContentIntoRecords({
          baseRecord,
          $elements: $bodyClone,
          maxRecordBytes: 1000,
          textAttributeName: "text",
          orderingAttributeName: "part",
        });

        // 2. Extract code snippets
        const code = helpers.codeSnippets({
          tag: "pre",
          languageClassPrefix: "language-",
        });

        // 3. DocSearch-style structured content
        let lvl0 = "Documentation";
        const breadcrumbs = [];
        $("#breadcrumbs li.breadcrumb-item").each((i, el) => {
          const name = $(el).find('[itemprop="name"]').text().trim();
          breadcrumbs.push(name);
        });
        if (breadcrumbs.length >= 2) lvl0 = breadcrumbs[breadcrumbs.length - 2];

        $(".docs-content h2 i").remove();

        const docsearchRecords = helpers.docsearch({
          aggregateContent: true,
          indexHeadings: true,
          recordVersion: "v3",
          recordProps: {
            lvl0: { selectors: "", defaultValue: lvl0 },
            lvl1: ".docs-content h1",
            lvl2: ".docs-content h2",
            lvl3: ".docs-content h3",
            lvl4: ".docs-content h4",
            lvl5: ".docs-content h5",
            content: ".main-content p, .main-content li",
          },
        });

        // Merge all extracted records into a single array;
        // the code record is appended only when code.code is set
        return [
          ...splitRecords,
          ...docsearchRecords,
          ...(code.code ? [{ code: code.code }] : []),
        ];
      },
    }
  ],
  initialIndexSettings: {
    hugos: {
      attributesForFaceting: ["type", "lang", "chunkIndex", "totalChunks"],
      attributesToRetrieve: [
        "hierarchy.lvl0",
        "hierarchy.lvl1",
        "hierarchy.lvl2",
        "hierarchy.lvl3",
        "hierarchy.lvl4",
        "hierarchy.lvl5",
        "hierarchy.lvl6",
        "content",
        "anchor",
        "url",
        "chunkIndex",
        "totalChunks",
      ],
      attributesToHighlight: ["hierarchy", "content"],
      attributesToSnippet: ["content:20"],
      searchableAttributes: [
        "unordered(hierarchy.lvl0)",
        "unordered(hierarchy.lvl1)",
        "unordered(hierarchy.lvl2)",
        "unordered(hierarchy.lvl3)",
        "unordered(hierarchy.lvl4)",
        "unordered(hierarchy.lvl5)",
        "unordered(hierarchy.lvl6)",
        "unordered(anchor)",
        "content",
      ],
      distinct: true,
      attributeForDistinct: "url",
      customRanking: [
        "desc(weight.pageRank)",
        "desc(weight.level)",
        "asc(weight.position)",
        "asc(chunkIndex)",
      ],
      ranking: [
        "words",
        "filters",
        "typo",
        "attribute",
        "proximity",
        "exact",
        "custom",
      ],
      highlightPreTag: '<span class="algolia-highlight">',
      highlightPostTag: "</span>",
      minWordSizefor1Typo: 3,
      minWordSizefor2Typos: 7,
      allowTyposOnNumericTokens: false,
      minProximity: 1,
      ignorePlurals: true,
      advancedSyntax: true,
      removeWordsIfNoResults: "allOptional",
    }
  },
});
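The configuration above repeats the site origin in four fields (startUrls, sitemaps, discoveryPatterns, pathsToMatch) and the index name in two places (the action's indexName and the key under initialIndexSettings). A minimal sketch of which values change when adapting it to another site is shown here. The helper `buildCrawlerTargets` and the `https://example.com` origin are hypothetical and not part of the Algolia Crawler; the sketch also assumes the site is served from the root of its origin and publishes `/sitemap.xml`.

```javascript
// Hypothetical helper (not an Algolia API): derives the URL fields of the
// crawler config from a single origin so they cannot drift apart.
function buildCrawlerTargets(siteUrl, indexName) {
  // URL#origin strips any path, so "https://example.com" and
  // "https://example.com/" produce the same result.
  const base = new URL(siteUrl).origin;
  return {
    startUrls: [`${base}/`],
    sitemaps: [`${base}/sitemap.xml`], // assumption: sitemap lives at the root
    discoveryPatterns: [`${base}/**`],
    pathsToMatch: [`${base}/**`],
    indexName, // must also be used as the key under initialIndexSettings
  };
}

const targets = buildCrawlerTargets("https://example.com", "my-docs");
console.log(targets);
```

The returned values can be copied into the matching fields of the configuration by hand; the crawler editor itself still expects the full `new Crawler({...})` object.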

If you need AI search, add the following configuration, then change the domain and indexName to your own values.

// Add inside the actions array above
{
  indexName: "hugos-md",
  pathsToMatch: ["https://hugo.xiaoying.org.cn/pages/**"],
  recordExtractor: ({ $, url, helpers }) => {
    // Target only the main content, excluding navigation
    const text = helpers.markdown("main");
    if (text === "") return [];

    const language = $("html").attr("lang") || "en";
    const title = $("head > title").text();

    // Get the main heading for better searchability
    const h1 = $(".docs-content h1").first().text();

    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: title || h1,
        heading: h1, // Add main heading as separate field
        lang: language,
      },
      maxRecordBytes: 8000,
      orderingAttributeName: "part",
    });
  },
},

// Add inside initialIndexSettings above
"hugos-md": {
  attributesForFaceting: ["type", "lang"],
  ignorePlurals: true,
  minProximity: 1,
  indexLanguages: ["zh"],
  queryLanguages: ["zh"],
  distinct: true,
  attributeForDistinct: "url",
  removeStopWords: false,
  searchableAttributes: ["title", "heading", "unordered(text)"],
  removeWordsIfNoResults: "lastWords",
  attributesToHighlight: ["title", "text"],
  typoTolerance: false,
  advancedSyntax: false,
},
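The AI action above splits each page's Markdown into records capped by maxRecordBytes and numbered through the attribute named in orderingAttributeName. To illustrate what a byte budget and an ordering attribute mean, here is a simplified stand-in. It is not Algolia's implementation: the function name `splitTextIntoRecordsSketch`, the paragraph-based splitting, the `#n` objectID suffix, and the choice to measure only the text (not the whole record) are all assumptions made for this sketch.

```javascript
// Simplified illustration of byte-capped chunking; NOT Algolia's algorithm.
function splitTextIntoRecordsSketch({ text, baseRecord, maxRecordBytes, orderingAttributeName }) {
  const encoder = new TextEncoder();
  // Sizes are measured in UTF-8 bytes, so one Chinese character counts as 3.
  const byteLength = (s) => encoder.encode(s).length;
  const records = [];
  let current = "";

  const flush = () => {
    if (current === "") return;
    records.push({
      ...baseRecord,
      objectID: `${baseRecord.objectID}#${records.length}`, // hypothetical ID scheme
      text: current,
      [orderingAttributeName]: records.length, // 0, 1, 2, ... in page order
    });
    current = "";
  };

  // Split on blank lines so a paragraph is never cut in half.
  for (const paragraph of text.split(/\n{2,}/)) {
    const candidate = current === "" ? paragraph : `${current}\n\n${paragraph}`;
    if (byteLength(candidate) > maxRecordBytes) {
      flush();
      current = paragraph; // an oversized single paragraph is kept whole here
    } else {
      current = candidate;
    }
  }
  flush();
  return records;
}

const records = splitTextIntoRecordsSketch({
  text: "第一段\n\n第二段\n\nthird",
  baseRecord: { url: "https://example.com/a", objectID: "https://example.com/a" },
  maxRecordBytes: 20,
  orderingAttributeName: "part",
});
console.log(records.map((r) => [r.part, r.text]));
```

Because the limit is in bytes rather than characters, Chinese pages fill a record roughly three times faster than ASCII text of the same length, which is worth keeping in mind when choosing maxRecordBytes.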

The screenshot below shows where the AI settings page is; fill it in as prompted.

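5. Index settings

Go back to the Search page and check whether the index contains data.

Next, configure the index and select the attributes that should be searchable.

Then configure facets. Facets are one of the core features behind advanced search and filtering: they let users narrow the search scope quickly and improve the search experience. Pay particular attention here: lang must be selected, otherwise searches from the website return no results.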
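A likely reason for the lang requirement, stated here as an assumption rather than something the article confirms, is that the VitePress search box sends a facet filter of the form `lang:<site language>` with each query, and Algolia can only filter on attributes declared as facets. The sketch below builds (but does not send) an equivalent request against Algolia's REST search endpoint so the filter can be tested by hand. The function name `buildLangFilteredQuery` and the placeholder credentials are hypothetical.

```javascript
// Builds a search request that filters on the `lang` facet.
// Nothing is sent here; pass the result to fetch() to run it.
function buildLangFilteredQuery({ appId, apiKey, indexName, query, lang }) {
  // The REST API accepts search parameters as a URL-encoded string in `params`.
  const params = new URLSearchParams({
    query,
    facetFilters: JSON.stringify([`lang:${lang}`]),
  });
  return {
    url: `https://${appId}-dsn.algolia.net/1/indexes/${encodeURIComponent(indexName)}/query`,
    init: {
      method: "POST",
      headers: {
        "X-Algolia-Application-Id": appId,
        "X-Algolia-API-Key": apiKey, // use a search-only key, never an admin key
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ params: params.toString() }),
    },
  };
}

const request = buildLangFilteredQuery({
  appId: "YOUR_APP_ID", // placeholder
  apiKey: "YOUR_SEARCH_ONLY_KEY", // placeholder
  indexName: "hugos",
  query: "algolia",
  lang: "zh-CN",
});
console.log(request.url);
// To run it for real: fetch(request.url, request.init).then((r) => r.json())
```

If the same query returns hits without the facetFilters entry but none with it, the lang facet is the thing to check in the index settings.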

6. Code configuration

Find the values for the following configuration as shown in the screenshot.

import { defineConfig } from 'vitepress'

export default defineConfig({
  ...
  lang: "zh-CN",
  ...
  themeConfig: {
    ...
    search: {
      provider: 'algolia',
      options: {
        appId: '...',
        apiKey: '...',
        indexName: '...',
        askAi: {
          indexName: '',
          assistantId: 'the Assistant ID obtained above'
        },
      },
    },
    ...
  }
})
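The configuration above leaves several values blank or elided, and an empty value is easy to overlook. As a rough sketch, a small check like the following could be run against the options object before deploying. The helper `findEmptyAlgoliaOptions` is hypothetical and not part of VitePress or DocSearch; it only looks at the field names that appear in the configuration above.

```javascript
// Hypothetical pre-flight check: lists option fields that are still empty
// or still hold the "..." placeholder.
function findEmptyAlgoliaOptions(options) {
  const isUnset = (value) => !value || value === "...";
  const problems = ["appId", "apiKey", "indexName"].filter((key) => isUnset(options[key]));
  // askAi is optional; check its fields only when the block is present.
  if (options.askAi) {
    for (const key of ["indexName", "assistantId"]) {
      if (isUnset(options.askAi[key])) problems.push(`askAi.${key}`);
    }
  }
  return problems;
}

const problems = findEmptyAlgoliaOptions({
  appId: "YOUR_APP_ID", // placeholder
  apiKey: "YOUR_SEARCH_ONLY_KEY", // placeholder
  indexName: "hugos",
  askAi: { indexName: "", assistantId: "YOUR_ASSISTANT_ID" },
});
console.log(problems);
```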

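Notice

Author: liyao

License: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit the source when reposting!

Link: https://xiaoying.org.cn/pages/20b6a7/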
