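Configuring Algolia
AI Summary
This article walks through the complete process of configuring the Algolia search service: registering an account and creating an application, creating an index, verifying the domain, configuring a crawler to extract page text and code blocks, setting index parameters to enable advanced search features, and finally collecting the values needed for the site configuration. It uses step-by-step screenshots to guide you from initial setup to integration.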
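_1. Register an account
First, sign up for an account on the Algolia website. You can sign in with GitHub or register with an email address.
A new account automatically gets an Application, and you can also create a new one yourself.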


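After clicking confirm, click the create Application button on the next page, then click NEXT to create the new Application. Once it has been created, click Skip for now; otherwise Algolia generates an index name from your domain and the crawler configuration will also be wrong.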

_2. Create an index
Create the index by following the steps in the figure below.

_3. Verify the domain
Verify your domain as prompted.

Then click Skip for now to move on to the next step.
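_4. Configure the crawler
Create a new crawler as shown in the figure below.

Once it is created, click the crawler's name to open its configuration.

The crawler configuration is shown below. First look at the auto-generated configuration and note its appId and apiKey, then fill them into the configuration below. Change the domain and indexName to your own values and paste the result into the Algolia code editor. When you are done, click start Crawling in the top-right corner to begin crawling the site.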
new Crawler({
  appId: "",
  apiKey: "",
  indexPrefix: "",
  rateLimit: 8,
  startUrls: ["https://xiaoying.org.cn/"],
  renderJavaScript: false,
  sitemaps: ["https://xiaoying.org.cn/sitemap.xml"],
  exclusionPatterns: [],
  ignoreCanonicalTo: true,
  discoveryPatterns: ["https://xiaoying.org.cn/**"],
  schedule: "on the first day of the week",
  actions: [
    {
      indexName: "hugos",
      pathsToMatch: ["https://xiaoying.org.cn/**"],
      recordExtractor: ({ url, $, helpers }) => {
        // 1. Extract the title and base content (used by splitContentIntoRecords)
        const baseRecord = {
          url,
          title: $("head title").text().trim(),
        };
        const $bodyClone = $("body").clone();
        // Remove code blocks so the code inside them is not parsed as page text
        $bodyClone.find("pre, code").remove();
        const splitRecords = helpers.splitContentIntoRecords({
          baseRecord,
          $elements: $bodyClone,
          maxRecordBytes: 1000,
          textAttributeName: "text",
          orderingAttributeName: "part",
        });
        // 2. Extract code snippets
        const code = helpers.codeSnippets({
          tag: "pre",
          languageClassPrefix: "language-",
        });
        // 3. DocSearch-style structured content
        let lvl0 = "Documentation";
        const breadcrumbs = [];
        $("#breadcrumbs li.breadcrumb-item").each((i, el) => {
          const name = $(el).find('[itemprop="name"]').text().trim();
          breadcrumbs.push(name);
        });
        // Use the second-to-last breadcrumb as the top-level category
        if (breadcrumbs.length >= 2) lvl0 = breadcrumbs[breadcrumbs.length - 2];
        $(".docs-content h2 i").remove();
        const docsearchRecords = helpers.docsearch({
          aggregateContent: true,
          indexHeadings: true,
          recordVersion: "v3",
          recordProps: {
            lvl0: { selectors: "", defaultValue: lvl0 },
            lvl1: ".docs-content h1",
            lvl2: ".docs-content h2",
            lvl3: ".docs-content h3",
            lvl4: ".docs-content h4",
            lvl5: ".docs-content h5",
            content: ".main-content p, .main-content li",
          },
        });
        // Merge everything that was extracted into a single array
        return [
          ...splitRecords,
          ...docsearchRecords,
          ...(code.code ? [{ code: code.code }] : []),
        ];
      },
    }
  ],
  initialIndexSettings: {
    hugos: {
      attributesForFaceting: ["type", "lang", "chunkIndex", "totalChunks"],
      attributesToRetrieve: [
        "hierarchy.lvl0",
        "hierarchy.lvl1",
        "hierarchy.lvl2",
        "hierarchy.lvl3",
        "hierarchy.lvl4",
        "hierarchy.lvl5",
        "hierarchy.lvl6",
        "content",
        "anchor",
        "url",
        "chunkIndex",
        "totalChunks",
      ],
      attributesToHighlight: ["hierarchy", "content"],
      attributesToSnippet: ["content:20"],
      searchableAttributes: [
        "unordered(hierarchy.lvl0)",
        "unordered(hierarchy.lvl1)",
        "unordered(hierarchy.lvl2)",
        "unordered(hierarchy.lvl3)",
        "unordered(hierarchy.lvl4)",
        "unordered(hierarchy.lvl5)",
        "unordered(hierarchy.lvl6)",
        "unordered(anchor)",
        "content",
      ],
      distinct: true,
      attributeForDistinct: "url",
      customRanking: [
        "desc(weight.pageRank)",
        "desc(weight.level)",
        "asc(weight.position)",
        "asc(chunkIndex)",
      ],
      ranking: [
        "words",
        "filters",
        "typo",
        "attribute",
        "proximity",
        "exact",
        "custom",
      ],
      highlightPreTag: '<span class="algolia-highlight">',
      highlightPostTag: "</span>",
      minWordSizefor1Typo: 3,
      minWordSizefor2Typos: 7,
      allowTyposOnNumericTokens: false,
      minProximity: 1,
      ignorePlurals: true,
      advancedSyntax: true,
      removeWordsIfNoResults: "allOptional",
    }
  },
});
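The `maxRecordBytes: 1000` and `orderingAttributeName: "part"` options above control how long pages are cut into several records. The sketch below is only a simplified model of that idea, not Algolia's implementation: `splitIntoParts` is a hypothetical helper, and it counts only the bytes of the text itself (the real helper's limit presumably applies to the whole record).

```javascript
// Hypothetical helper (not part of the Algolia Crawler API): cut `text` into
// chunks of at most `maxBytes` UTF-8 bytes and number them with an ordering
// attribute, roughly the shape that splitContentIntoRecords is configured for.
function splitIntoParts(text, baseRecord, maxBytes, orderingAttributeName = "part") {
  const encoder = new TextEncoder();
  const chunks = [];
  let current = "";
  let currentBytes = 0;
  // Iterate by code point so multi-byte characters (e.g. Chinese) stay whole.
  for (const ch of text) {
    const chBytes = encoder.encode(ch).length;
    if (current !== "" && currentBytes + chBytes > maxBytes) {
      chunks.push(current);
      current = "";
      currentBytes = 0;
    }
    current += ch;
    currentBytes += chBytes;
  }
  if (current !== "") chunks.push(current);
  // Every chunk inherits the base record (url, title, ...) plus its position.
  return chunks.map((chunk, index) => ({
    ...baseRecord,
    text: chunk,
    [orderingAttributeName]: index,
  }));
}
```

Because Chinese characters take three bytes each in UTF-8, a 1000-byte limit holds far fewer Chinese characters than Latin ones, which is worth keeping in mind when tuning `maxRecordBytes` for a Chinese site.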
If you need AI search, add the following configuration, again replacing the domain and indexName with your own.
// Add inside the actions array above
{
  indexName: "hugos-md",
  pathsToMatch: ["https://hugo.xiaoying.org.cn/pages/**"],
  recordExtractor: ({ $, url, helpers }) => {
    // Target only the main content, excluding navigation
    const text = helpers.markdown("main");
    if (text === "") return [];
    const language = $("html").attr("lang") || "en";
    const title = $("head > title").text();
    // Get the main heading for better searchability
    const h1 = $(".docs-content h1").first().text();
    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: title || h1,
        heading: h1, // Add main heading as separate field
        lang: language,
      },
      maxRecordBytes: 8000,
      orderingAttributeName: "part",
    });
  },
}
// Add inside initialIndexSettings above
"hugos-md": {
  attributesForFaceting: ["type", "lang"],
  ignorePlurals: true,
  minProximity: 1,
  indexLanguages: ["zh"],
  queryLanguages: ["zh"],
  distinct: true,
  attributeForDistinct: "url",
  removeStopWords: false,
  searchableAttributes: ["title", "heading", "unordered(text)"],
  removeWordsIfNoResults: "lastWords",
  attributesToHighlight: ["title", "text"],
  typoTolerance: false,
  advancedSyntax: false,
},
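To make the placement of the two fragments concrete, here is a small sketch of the resulting shape. It is purely illustrative: `addIndex` is a hypothetical helper, and in the Crawler editor you paste the fragments in by hand rather than calling a function.

```javascript
// Hypothetical helper: returns a copy of a crawler config object with one more
// action appended to `actions` and that action's index settings registered
// under `initialIndexSettings`, keyed by the action's indexName.
function addIndex(config, action, indexSettings) {
  return {
    ...config,
    actions: [...config.actions, action],
    initialIndexSettings: {
      ...config.initialIndexSettings,
      [action.indexName]: indexSettings,
    },
  };
}

// Usage with trimmed-down stand-ins for the configuration shown above.
const baseConfig = {
  actions: [{ indexName: "hugos" }],
  initialIndexSettings: { hugos: { distinct: true } },
};
const withAiIndex = addIndex(
  baseConfig,
  { indexName: "hugos-md" },
  { indexLanguages: ["zh"], queryLanguages: ["zh"] }
);
console.log(Object.keys(withAiIndex.initialIndexSettings));
```

The point is that the new action and its settings share the same index name, so the key under `initialIndexSettings` must match the action's `indexName` exactly.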
The figure below shows where to add this on the AI page; add it as prompted.

_5. Index settings
Go back to the search page and check whether the index contains data.
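Besides looking at the dashboard, the same check can be done from a script. The sketch below assumes Node.js 18+ (for the global `fetch`) and uses Algolia's Search REST endpoint with a search-only API key; `buildSearchRequest` and `countRecords` are hypothetical helper names, and the credentials are placeholders you supply yourself.

```javascript
// Build a search request against the Algolia Search REST API.
// An empty query matches everything, so nbHits reflects how much was indexed.
function buildSearchRequest(appId, searchApiKey, indexName, query = "") {
  const params = new URLSearchParams({ query, hitsPerPage: "1" }).toString();
  return {
    url: `https://${appId}-dsn.algolia.net/1/indexes/${encodeURIComponent(indexName)}/query`,
    init: {
      method: "POST",
      headers: {
        "X-Algolia-Application-Id": appId,
        "X-Algolia-API-Key": searchApiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ params }),
    },
  };
}

// Send the request and return the number of hits the index reports.
async function countRecords(appId, searchApiKey, indexName) {
  const { url, init } = buildSearchRequest(appId, searchApiKey, indexName);
  const response = await fetch(url, init);
  if (!response.ok) throw new Error(`Algolia returned HTTP ${response.status}`);
  const { nbHits } = await response.json();
  return nbHits;
}
```

Calling `countRecords` with your own appId, search-only key, and index name should resolve to a non-zero number once the crawl has finished; zero suggests the crawler did not write to that index.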

Next, configure the index and choose which attributes should be searchable.

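Next, configure facets. Facets are one of the core features behind advanced search and filtering: they let users narrow down results quickly and improve the search experience. Pay particular attention here: lang must be selected, otherwise searches on the site return nothing.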
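A likely reason for the empty results is that the site's search box filters every query by language (for example with a facet filter such as `lang:zh-CN`), and a filter on an attribute that is not declared as a facet matches nothing. That is an assumption about the front end, not something stated above. The toy model below illustrates the behaviour; `simulateFacetFilter` is a hypothetical function and does not call Algolia.

```javascript
// Toy model of facet filtering: a filter only works on attributes that were
// declared in attributesForFaceting; otherwise the result set is empty.
function simulateFacetFilter(records, attributesForFaceting, attribute, value) {
  if (!attributesForFaceting.includes(attribute)) {
    return []; // attribute is not a facet, so the filter cannot match anything
  }
  return records.filter((record) => record[attribute] === value);
}

const sampleRecords = [
  { url: "https://example.com/a", lang: "zh-CN" },
  { url: "https://example.com/b", lang: "en" },
];

// With lang declared as a facet, the Chinese page is found.
const withLangFacet = simulateFacetFilter(sampleRecords, ["type", "lang"], "lang", "zh-CN");
// Without it, the same filter yields no results at all.
const withoutLangFacet = simulateFacetFilter(sampleRecords, ["type"], "lang", "zh-CN");
console.log(withLangFacet.length, withoutLangFacet.length);
```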

_6. Code configuration
Find the following values as shown in the figures.



export default defineConfig({
  ...
  lang: "zh-CN",
  ...
  themeConfig: ({
    ...
    search: {
      provider: 'algolia',
      options: {
        appId: '...',
        apiKey: '...',
        indexName: '...',
        askAi: {
          indexName: '',
          assistantId: 'the Assistant ID obtained above'
        },
      },
    }
    ...
  })
})
