meilisearch部署原创

# 1. 部署

官方对于部署的介绍非常详细，各种方案都提供了，我这里选择使用 docker 来进行部署。

添加服务启动脚本start.sh到/tmp/scraper目录

docker run -itd --name meilisearch -p 7700:7700 --restart=always \
  -e MEILI_ENV="production" -e MEILI_NO_ANALYTICS=true \
  -e MEILI_MASTER_KEY="自定义一个不少于16字节的秘钥" \
  -v $(pwd)/meili_data:/meili_data \
  getmeili/meilisearch

1
2
3
4
5

自建的时候，需要将环境变量声明为生产，并且必须指定 master-key，否则将会提示无法使用。

然后运行该脚本，服务启动，通过监听日志，查看服务状态是否正常。

也可以请求服务的健康接口进行验证：

$ curl -s http://localhost:7700/health | jq
{
  "status": "available"
}

1
2
3
4

注意，生产模式下，只有这一个接口是不需要秘钥认证即可访问的，其他接口访问的时候都需要带上秘钥。

# 2. 创建搜索的key

上边有了一个 master-key 用于爬虫抓取使用，还需要创建一个只有搜索权限的 key，可通过如下命令进行创建search.sh到/tmp/scraper目录

curl \
  -X POST 'http://localhost:7700/keys' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer 你自定义的秘钥' \
  --data-binary '{
    "description": "xiaoying.org.cn key",
    "actions": ["search"],
    "indexes": ["blog"],  // 第四步建立索引抓取配置中的index_uid的值需与该值保持一致
    "expiresAt": "2099-01-01T00:00:00Z"
  }'

1
2
3
4
5
6
7
8
9
10

创建完成之后，能看到返回内容中有一个 key 的字段，就是这个只有搜索权限的 key 了。

# 3.添加域名

这个根据自己的实际情况，我这里给 Nginx 添加配置文件，配置域名：

server {
    listen 443 ssl;
    server_name xiaoying.org.cn;

    ssl_certificate /etc/ssl/certs/xiaoying.org.cn_bundle.crt;
    ssl_certificate_key /etc/ssl/certs/xiaoying.org.cn.key;
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;

    location ^~ /indexes/ {
        proxy_set_header Host $host;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_set_header   X-Real-IP         $remote_addr;
        proxy_pass http://127.0.0.1:7700;
    }

}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

这样就完成了与meilisearch一样的服务端配置信息：

服务端 URL(https://xiaoying.org.cn/)
master key(第一步自定义)
search key（第二步生成）

# 4. 建立索引

官方提供了爬虫工具，我们只需要进行简单的配置，即可将数据索引建立起来。

关于这段配置流程，官方文档同样给了详细的说明：抓取你的内容 (opens new window) (opens new window)。

官方提供了一个 Vuepress 的抓取配置config.json如下：

{
  "index_uid": "blog",
  "sitemap_urls": ["https://xiaoying.org.cn/sitemap.xml"],
  "start_urls": ["https://xiaoying.org.cn"],
  "selectors": {
    "lvl0": {
      "selector": ".sidebar-heading.open",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": ".theme-default-content h1",
    "lvl2": ".theme-default-content h2",
    "lvl3": ".theme-default-content h3",
    "lvl4": ".theme-default-content h4",
    "lvl5": ".theme-default-content h5",
    "text": ".theme-default-content p, .theme-default-content li, .theme-default-content td"
  },
  "strip_chars": " .,;:#",
  "scrap_start_urls": true,
  "custom_settings": {
    "synonyms": {
      "relevancy": ["relevant", "relevance"],
      "relevant": ["relevancy", "relevance"],
      "relevance": ["relevancy", "relevant"]
    }
  }
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

注意如上的配置内容很重要，如果你的博客不是常规默认的，那么需要根据自己的情况对元素进行辨别，详细配置项说明，参考更多可选字段 (opens new window) (opens new window)。

本项目用的是 vdoing 主题，可以看到一些元素名字与内容不一样，需要一些调整。所以我用的配置如下：

{
    "index_uid": "blog",
    "sitemap_urls": ["https://xiaoying.org.cn/sitemap.xml"],
    "start_urls": ["https://xiaoying.org.cn/"],
    "stop_urls": [
	     "https://xiaoying.org.cn/HowToStartOpenSource/archives/"
    ],
    "selectors": {
        "lvl0": {
            "selector": ".sidebar-heading.open",
            "global": true,
            "default_value": "Documentation"
        },
        "lvl1": ".theme-vdoing-content h2",
        "lvl2": ".theme-vdoing-content h3",
        "lvl3": ".theme-vdoing-content h4",
        "lvl4": ".theme-vdoing-content h5",
        "lvl5": ".theme-vdoing-content h6",
        "text": ".theme-vdoing-content p, .theme-vdoing-content li"
    },
    "strip_chars": " .,;:#",
    "scrap_start_urls": true,
    "selectors_exclude": ["iframe", ".katex-block", ".md-flowchart", ".md-mermaid", ".md-presentation.reveal.reveal-viewport", ".line-numbers-mode", ".code-group", ".footnotes", "footer.page-meta", ".page-nav", ".comments-wrapper"]
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

index_uid ：为索引名称，如果服务端没有，则会自动创建，需与第二步的indexes保持一致。

新建do.sh如下对内容进行抓取：

docker run -t --rm \
  --network=host \
  -e MEILISEARCH_HOST_URL='http://localhost:7700' \
  -e MEILISEARCH_API_KEY='第一步自定义的Master Key' \
  -v /tmp/scraper/config.json:/docs-scraper/config.json \
  getmeili/docs-scraper pipenv run ./docs_scraper config.json

1
2
3
4
5
6

将config.json与do.sh放到/tmp/scraper目录下，然后通过如下命令运行爬虫对内容进行抓取：

sh do.sh

提示

如果脚本跑完发现最后匹配到了 0 条，可能是上边 config.json 中元素选择的问题，可以到自己博客中，点击检查来查看元素的正确名称。

# 5. 配置客户端

客户端的配置就相对简单了，因为 meilisearch 的官方文档用的也是 Vuepress，因此官方也维护了一个 Vuepress 的插件，基本上就是开箱即用，下边安装配置直接进行，不多废话。

插件地址：vuepress-plugin-meilisearch(opens new window) (opens new window)
说明文档：README(opens new window) (opens new window)

# 安装插件

yarn add vuepress-plugin-meilisearch
# or
npm install vuepress-plugin-meilisearch

1
2
3

# 配置插件

在插件的配置文件当中添加如下配置内容：

// 全文搜索插件 meilisearch
  [
    'vuepress-plugin-meilisearch',
      {
          hostUrl: 'https://xiaoying.org.cn/',  // 服务器防火墙开7700端口
          apiKey: "",   // 第二步生成的只有搜索权限的 key
          indexUid: '', // 第四步config.json里的index_uid
          placeholder: '搜索一下，你就知道',   // 在搜索栏中显示的占位符
          maxSuggestions: 30,                      // 最多显示几个搜索结果
          cropLength: 50,                         // 每个搜索结果最多显示多少个字符
      },
  ],

1
2
3
4
5
6
7
8
9
10
11
12

如此，便就完成了 Vuepress 博客对 meilisearch 搜索的接入。

# 6. 索引自动化

当我们有新的文章发布时，应该重新运行抓取文章建立索引的命令，如果你的博客是通过 Github Action 进行发布的，那么官方还提供了通过 Action 自动抓取的方案。

首先在项目根目录下新建.github/workflows/crawler.yml文件，内容如下：

name: Auto Crawler

on:
  schedule:
    - cron: '0 9 1/7 * *' //每过七天，早上九点开始自动爬取内容

jobs:
  crawler:
    runs-on: ubuntu-latest
    steps:
      - name: ssh docker login   
        uses: appleboy/ssh-action@v1.0.0
        with:
          host: ${{ secrets.TENCENT_CLOUD_IP }} 
          username: ${{ secrets.TENCENT_CLOUD_NAME }} 
          password: ${{ secrets.TENCENT_CLOUD_PASSWORD }} 
          script:  cd /tmp/scraper && sh do.sh

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

如上内容需要依赖以下配置信息：

host：云服务器IP
username：云服务器登录用户名，一般为root
password：云服务器登录密码

以上三个配置需到对应的github仓库填写secret
script：登录之后执行的指令，此处就是重新爬取内容

如上内容准备完毕之后，当我们提交了新的代码，部署上去之后，就会自动运行抓取内容

# 7. vuepress搜索结果报错处理

默认搜索结果跳转会报如下错误

原因是因为中文被转义而找不到对应的选择器，只需修改源码将中文解码即可

// vuepress-plugin-smooth-scroll/lib/enhanceApp.js
export default ({ Vue, router }) => {
    router.options.scrollBehavior = (to, from, savedPosition) => {
        if (savedPosition) {
            return window.scrollTo({
                top: savedPosition.y,
                behavior: 'smooth',
            });
        }
        else if (to.hash) {
            if (Vue.$vuepress.$get('disableScrollBehavior')) {
                return false;
            }
          
            // 解码并清理 to.hash 值以去除特殊字符
            const decodedHash = decodeURIComponent(to.hash);
            const sanitizedHash = decodedHash.replace(/[^\w\u4e00-\u9fa5-]/g, '');
            const targetElement = document.querySelector('#' + sanitizedHash.replace(/%/g, '\\%'));
            
           //执行跳转逻辑
            if (targetElement) {
                return window.scrollTo({
                    top: getElementPosition(targetElement).y,
                    behavior: 'smooth',
                });
            }
            return false;
        }
        else {
            return window.scrollTo({
                top: 0,
                behavior: 'smooth',
            });
        }
    };
};

function getElementPosition(el) {
    const docEl = document.documentElement;
    const docRect = docEl.getBoundingClientRect();
    const elRect = el.getBoundingClientRect();
    return {
        x: elRect.left - docRect.left,
        y: elRect.top - docRect.top,
    };
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47

修改完成后使用 patch-package (opens new window) 修改，以免下次运行时依赖还原

yarn patch-package vuepress-plugin-smooth-scroll

#工具类 #meilisearch

上次更新: 2025/02/18 14:46:10

← pandoc使用 cloudreve部署→