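1 Problem

Our project uses a multi-level cache (caffeine + redis). To raise the local cache hit rate, we want requests that share certain parameters to land on the same machine.

1.1 Problem analysis

Ours is an internal service, so ipHash cannot be used.
Requests do not always carry parameters such as userId/buildingId/guestCode, so the hash strategy has to be configured per URL.
Neither the clients nor the gateway handle common parameters in a unified way.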

The typical cases are listed in the table below:

URI	Parameters	Hash key
/room/room_info	buildingId, roomCode, needOwner	buildingId + roomCode
/room/guest_list	buildingId, roomCode	buildingId + roomCode
/guest/last_lesson	guestCode	guestCode
/owner/virtual_room	buildingId, ownerCode	buildingId + ownerCode
/lessons_date	userId, date	userId
/owner_info	userId, userIds	no hash
/room/virtual_total	vRoomCode	no hash
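
The table can also be written down as data. The following is a minimal sketch in plain Lua, not the script used later in this article: the names hashRules and buildHashKey are hypothetical, and the key format (name=value joined by &) is an assumption made for illustration.

```lua
-- Hypothetical rule table: URI -> ordered list of parameters that form the hash key.
-- An empty list means "no hash" (fall back to round robin).
local hashRules = {
    ["/room/room_info"]     = { "buildingId", "roomCode" },
    ["/room/guest_list"]    = { "buildingId", "roomCode" },
    ["/guest/last_lesson"]  = { "guestCode" },
    ["/owner/virtual_room"] = { "buildingId", "ownerCode" },
    ["/lessons_date"]       = { "userId" },
    ["/owner_info"]         = {},
    ["/room/virtual_total"] = {},
}

-- Returns the hash key for a request, or nil when the request should not be hashed
local function buildHashKey(uri, args)
    local names = hashRules[uri]
    if not names or #names == 0 then
        return nil
    end
    local parts = {}
    for _, name in ipairs(names) do
        local value = args[name]
        if value == nil or value == "" then
            return nil -- a required parameter is missing, so do not hash
        end
        parts[#parts + 1] = name .. "=" .. tostring(value)
    end
    return table.concat(parts, "&")
end

print(buildHashKey("/room/room_info", { buildingId = "1", roomCode = "14", needOwner = "true" }))
print(buildHashKey("/owner_info", { userId = "7" }))
```

The first call prints buildingId=1&roomCode=14 (needOwner is ignored because it is not part of the rule), and the second prints nil. A per-URI table like this is stricter than a single global rule, at the cost of having to list every URI.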

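2 Approaches

I came up with two approaches:

2.1 Dispatching in an application

Idea

Run a separate application A dedicated to request dispatching.
Suppose backend application B has N servers, and each server of application B subscribes to its own message queue topic.
When application A receives a request, it parses the parameters as required, uses consistent hashing to work out which machine of application B should handle it, and sends the request to the corresponding topic.
Each server of application B consumes the request messages from its own topic, parses them, and processes them.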
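
The routing step in application A can be sketched as a small consistent hash ring. This is an illustration in plain Lua under stated assumptions: the topic names, the replica count of 100, and the functions hash32, buildRing and pickNode are all hypothetical, and hash32 is a simple arithmetic-only string hash chosen so that the sketch runs on any Lua version. A real deployment would likely use a stronger hash such as CRC32 or MD5.

```lua
-- Simple 32-bit string hash using only arithmetic (stand-in for CRC32/MD5)
local function hash32(s)
    local h = 5381
    for i = 1, #s do
        h = ((h + s:byte(i)) * 1000003) % 4294967296
    end
    -- one extra mixing round to spread similar strings a little better
    h = ((h + math.floor(h / 65536)) * 1000003) % 4294967296
    return h
end

-- Places `replicas` virtual points per node on the ring, sorted by position
local function buildRing(nodes, replicas)
    local ring = {}
    for _, node in ipairs(nodes) do
        for r = 1, replicas do
            ring[#ring + 1] = { point = hash32(node .. "#" .. r), node = node }
        end
    end
    table.sort(ring, function(a, b)
        if a.point ~= b.point then
            return a.point < b.point
        end
        return a.node < b.node -- deterministic order if two points collide
    end)
    return ring
end

-- Walks clockwise from the key's position to the first virtual point
local function pickNode(ring, key)
    local h = hash32(key)
    for _, entry in ipairs(ring) do
        if entry.point >= h then
            return entry.node
        end
    end
    return ring[1].node -- wrap around to the start of the ring
end

local topics = { "topic-b-1", "topic-b-2", "topic-b-3", "topic-b-4" }
local ring = buildRing(topics, 100)
local key = "buildingId=1&roomCode=14"
print(pickNode(ring, key)) -- the same key always maps to the same topic
```

The property that matters for cache locality is that removing one server only remaps the keys that were on that server; keys on the remaining servers stay where they were. The linear scan in pickNode keeps the sketch short; a binary search over the sorted ring would be the usual choice for larger rings.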

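Characteristics

Suitable scenario: quick acknowledgement (application A) + asynchronous processing of complex logic (message queue + application B)
Pros: easy for the development team to manage and control.
Cons: a separate service has to be deployed; asynchronous processing may require extra state on the business side to track processing status.

2.2 Dispatching in nginx

Idea

nginx supports lua extensions and also supports consistent hashing. A lua script can parse the request and decide, according to the requirements, between consistent hash dispatch and round robin dispatch. Round robin is the default; when consistent hashing is needed, compute the hashKey from the actual requirements and dispatch by hashKey.

Characteristics

Pros: fits most scenarios, needs no separate service, and is non-intrusive to the application.
Cons: lua syntax takes some getting used to, nginx performance drops slightly, and the operations team has to help roll it out.
The performance cost comes from the extra layer of logic that every request passes through in nginx; unless nginx throughput requirements are particularly high, this is usually not a problem.
Pay special attention to compatibility between the nginx version, the individual nginx modules, and luajit.

Our application uses the nginx dispatching approach.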

3 Environment setup

Install lua

Reference: http://www.lua.org/download.html
curl -R -O http://www.lua.org/ftp/lua-5.4.0.tar.gz
tar zxf lua-5.4.0.tar.gz
cd lua-5.4.0
make all test
make install

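Download nginx and the plugins
To avoid nginx compatibility problems we use the openresty distribution, which ships with the lua extension.
Download openresty: https://openresty.org/en/download.html . I used openresty-1.11.2.5 .
Download the nginx consistent hash plugin: https://github.com/replay/ngx_http_consistent_hash

Install dependencies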

brew update
brew install pcre openssl curl

Install nginx and the plugin

./configure --with-cc-opt="-I/usr/local/opt/openssl/include/ -I/usr/local/opt/pcre/include/" --with-ld-opt="-L/usr/local/opt/openssl/lib/ -L/usr/local/opt/pcre/lib/" -j2 --add-module=…/pkgs/ngx_http_consistent_hash-master # the 2 in -j2 is the number of cores
sudo make -j2
sudo make install

Check that the installation succeeded:

/usr/local/openresty/nginx/sbin/nginx -V

(Screenshot: output of nginx -V)
Default conf file location: /usr/local/openresty/nginx/conf/nginx.conf

The following command creates a symlink so that nginx can be run more conveniently:

sudo ln -s /usr/local/openresty/nginx/sbin/nginx /usr/sbin/nginx

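4 Common nginx commands

nginx # start nginx by running the nginx command directly
nginx -t # check whether the configuration allows a normal start
nginx -s reload # reload the configuration
nginx -s stop # stop the nginx process
nginx -t -c /usr/local/nginx/conf/nginx.conf # check the syntax of a specific configuration file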

5 Implementation

Modify nginx.conf:

    server {
        listen       80;
        server_name  localhost;
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        location / {
            set $hashkey "";
            set $backendupstream "rrbackend";
            # the lua script sets backendupstream and hashkey
            rewrite_by_lua_file '../set_upstream.lua';
            proxy_pass   http://$backendupstream;
        }
    }

    # round robin
    upstream rrbackend {
      # simulate 4 backend servers
      server 127.0.0.1:8881;
      server 127.0.0.1:8882;
      server 127.0.0.1:8883;
      server 127.0.0.1:8884;
    }
    # consistent hash
    upstream hashbackend {
      consistent_hash $hashkey;
      server 127.0.0.1:8881;
      server 127.0.0.1:8882;
      server 127.0.0.1:8883;
      server 127.0.0.1:8884;
    }

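I opened 4 shell windows and started one server in each, on ports 8881-8884, so that whichever shell prints log output shows which server the request finally reached.

The nginx configuration above is fairly simple: it defines two upstream groups, one round robin and one consistent hash.
Every request goes through set_upstream.lua, which computes backendupstream and hashkey, and is then forwarded to the group named by backendupstream. If the request is forwarded to the consistent hash group, the consistent hash is computed from hashkey and the request finally reaches the backend machine that the hash value maps to.

Main logic: set_upstream.lua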

-- Inputs: $hashkey, $backendupstream
-- $hashkey is set when consistent hashing is used
-- $backendupstream names the upstream that will be used

-- Blacklist: blacklisted urls never use consistent hashing
local blackList = {
    "/owner/towner_info",
    "/room/virtual_total"
}

-- Returns 1 if the url is in the blacklist, otherwise 0
local function inBlackList(url)
    for _, v in ipairs(blackList) do
        if url == v then
            return 1
        end
    end
    return 0
end

-- Returns true if the string is nil or empty
local function isBlank(str)
    return str == nil or str == ""
end

-- Returns the Content-Type request header
local function getContentType()
    local headers, err = ngx.req.get_headers()
    if err == "truncated" then
        -- one can choose to ignore or reject the current request here
    end
    return headers["Content-Type"]
end

---- Main logic ----
-- If the request url is blacklisted, return immediately and use round robin
if inBlackList(ngx.var.uri) == 1 then
    ngx.log(ngx.INFO, "useRR for blackList. ")
    ngx.var.backendupstream = "rrbackend"
    return
end

-- Get the request parameters as args (a table)
local args = nil
local err = nil
local requestMethod = ngx.var.request_method
if ("GET" == requestMethod) then
    args = ngx.req.get_uri_args()
elseif ("POST" == requestMethod) then
    ngx.req.read_body()
    local contentType = getContentType()
    if contentType == "application/json" then
        -- For application/json, the table returned by get_post_args has the form ["a=1&b=11" : true], so we parse the body string ourselves
        local argStr = ngx.req.get_body_data()
        -- cjson.safe returns nil instead of raising an error on malformed JSON
        local json = require("cjson.safe")
        -- argStr is nil when the body is empty or was buffered to a temp file
        local jsonObj = argStr and json.decode(argStr)
        args = jsonObj
    elseif contentType == "application/x-www-form-urlencoded" then
        -- For application/x-www-form-urlencoded, the table returned by get_post_args has the form ["a": 1, "b": 11]
        args,err = ngx.req.get_post_args()
        if err == "truncated" then
            -- Our requests are small, so this case is not handled specially; if it ever happens, use round robin
            ngx.log(ngx.INFO, "post.err:", err)
            ngx.var.backendupstream = "rrbackend"
            return
        end
    else
        -- multipart/form-data and text/plain can be ignored in our business scenario; if they ever occur, use round robin
        ngx.var.backendupstream = "rrbackend"
        return
    end
end

-- If the request parameters cannot be obtained (or are not a table), fall back to round robin
if type(args) ~= "table" then
    ngx.log(ngx.INFO, "useRR: get blank args")
    ngx.var.backendupstream = "rrbackend"
    return
end

-- Read the business parameters
local buildingId = args["buildingId"]
local userId = args["userId"]
local ownerCode = args["ownerCode"]
local guestCode = args["guestCode"]
local roomCode = args["roomCode"]

-- If consistent hashing is not needed, use round robin
if (isBlank(userId) and isBlank(guestCode) and isBlank(ownerCode) and isBlank(roomCode)) then
    ngx.log(ngx.INFO, "useRR for allBlankSpecial")
    ngx.var.backendupstream = "rrbackend"
    return
end

-- Assemble the consistent hash key
local key = "buildingId"..tostring(buildingId).."roomCode"..tostring(roomCode).."userId"..tostring(userId).."ownerCode"..tostring(ownerCode).."guestCode"..tostring(guestCode);

-- Set the output variables
ngx.var.hashkey = key
ngx.var.backendupstream = "hashbackend"
ngx.log(ngx.INFO, "backendupstream=", ngx.var.backendupstream, ", key=", key)

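The main logic of the lua code above is straightforward: read the relevant parameters from the request and, if consistent hashing is needed, compute the consistent hash key.
Note that for POST requests the parameters are handled differently depending on the Content-Type.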
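
One detail worth checking when branching on Content-Type: the script compares the header with exact strings, and a client that sends "application/json; charset=UTF-8" would not match "application/json". A minimal sketch of a normalizing helper follows; the name mediaType is hypothetical and the helper is not part of the original script.

```lua
-- Returns the lower-cased media type without parameters such as charset,
-- or nil when the header is missing or is not a single string
local function mediaType(contentType)
    if type(contentType) ~= "string" then
        return nil
    end
    -- keep everything before the first ";" and trim surrounding whitespace
    local mt = contentType:match("^([^;]*)")
    mt = mt:gsub("^%s+", ""):gsub("%s+$", "")
    if mt == "" then
        return nil
    end
    return mt:lower()
end

print(mediaType("application/json; charset=UTF-8"))
print(mediaType("Application/X-WWW-Form-Urlencoded"))
```

The two calls print application/json and application/x-www-form-urlencoded. The type check also covers the case where a header is sent more than once and the header lookup yields a table instead of a string.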

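6 Self-testing

Only the test for the main logic is given here.

Reload the nginx configuration: nginx -s reload
Send the same request several times with identical parameters: http://localhost/room/room_info?buildingId=1&roomCode=14
Watch the logs: all of these requests reach the same server.
Remove the roomCode parameter and send the request several times: http://localhost/room/room_info?buildingId=1
The requests are now spread round robin across all the servers.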

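7 Performance analysis

todo
The project is not live yet. We plan to ship this feature during load testing and compare the performance metrics before and after.
In addition, given our original goal of raising the local cache hit rate, we will also analyze overall performance together with the local cache.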

8 FAQ

openresty is already installed; how do I add the consistent hash module?

The general way to install a third-party nginx module:

cd /nginxPath/openresty-1.11.2.5 # switch to the nginx source directory
./configure --prefix=/your-install-dir --add-module=/third-party-module-dir # if --prefix is omitted, the default is used

To install several third-party modules, pass --add-module multiple times.

With openresty already installed, install the ngx_http_consistent_hash module:

cd /nginxPath/openresty-1.11.2.5 # switch to the nginx source directory
./configure --with-cc-opt="-I/usr/local/opt/openssl/include/ -I/usr/local/opt/pcre/include/" --with-ld-opt="-L/usr/local/opt/openssl/lib/ -L/usr/local/opt/pcre/lib/" -j2 --add-module=/Users/mrpp/pkgs/ngx_http_consistent_hash-master # reconfigure
make # rebuild; do not run make install
nginx -s stop # stop nginx
sudo cp build/nginx-1.11.2/objs/nginx /usr/local/openresty/nginx/sbin/nginx # install the new nginx binary over the old one
/usr/local/openresty/nginx/sbin/nginx # start nginx

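ngx.log is used in lua but no log output appears

Find the error_log directive in nginx.conf; by default it is usually near the top of the file. It looks like:

error_log logs/error.log notice;

The first argument of the directive gives the location of error.log; a relative path is resolved against the nginx installation directory. For example, my error.log is in /usr/local/openresty/nginx/logs .
The second argument sets the log level; it can be debug, info, notice, warn, error, crit, alert, or emerg. Only messages at or above the configured level are written to error.log; the default is error.
In short: locate the configuration file, change the level in error_log so that it is low enough to show your output (the script above logs at ngx.INFO, so the level must be info or debug), then look at error.log in the location found above.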

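lua cannot call ngx.req.read_body()

When set_by_lua / set_by_lua_file is used to read the request parameters of a POST request, an error like the following may occur:

failed to run set_by_lua: set_by_lua:6: API disabled in the context of set_by_lua
stack traceback: [C]: in function 'read_body' set_by_lua:6: in function <set_by_lua:1>

Cause: nginx is commonly deployed as a proxy and load balancer; to preserve performance it does not read the request body by default, and OpenResty is likewise cautious about handling the body.
ngx.req.read_body() is only available in the rewrite_by_lua*, access_by_lua* and content_by_lua* phases, and ngx.req.get_post_args() requires the body to have been read first, either by calling ngx.req.read_body() or by turning on the lua_need_request_body option to force the module to read the request body (not recommended).
Solution: use rewrite_by_lua / rewrite_by_lua_file instead.
Further reading: "openResty中获取请求 body" (getting the request body in OpenResty)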

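How to get the body parameters of a multipart/form-data request

For a POST request whose Content-Type is multipart/form-data, we have to parse the boundary and the key/value parameters out of the body ourselves, following the HTTP protocol.

Reference: "lua在web开发中获取GET或POST参数" (getting GET or POST parameters with lua in web development)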
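
As a rough illustration of what that parsing involves, here is a minimal sketch in plain Lua. It is not taken from the referenced article: getBoundary and parseMultipart are hypothetical names, and the sketch assumes CRLF line endings, a body that is fully in memory, and simple text fields only (file parts are skipped).

```lua
-- Extracts the boundary from a Content-Type value such as
-- 'multipart/form-data; boundary=----abc'
local function getBoundary(contentType)
    return contentType:match('boundary="([^"]+)"')
        or contentType:match("boundary=([^;%s]+)")
end

-- Collects simple text fields from a multipart/form-data body
local function parseMultipart(body, boundary)
    local args = {}
    local delimiter = "--" .. boundary
    local pos = 1
    while true do
        -- plain (non-pattern) search for this delimiter and the next one
        local s, e = body:find(delimiter, pos, true)
        if not s then break end
        local nextStart = body:find(delimiter, e + 1, true)
        if not nextStart then break end
        -- a part looks like "\r\n<headers>\r\n\r\n<value>\r\n"
        local part = body:sub(e + 1, nextStart - 1)
        local headers, value = part:match("^\r\n(.-)\r\n\r\n(.*)\r\n$")
        if headers then
            local name = headers:match('name="([^"]*)"')
            local isFile = headers:find('filename="', 1, true)
            if name and not isFile then
                args[name] = value
            end
        end
        pos = nextStart
    end
    return args
end

local body = table.concat({
    "--XYZ",
    'Content-Disposition: form-data; name="buildingId"',
    "",
    "1",
    "--XYZ",
    'Content-Disposition: form-data; name="roomCode"',
    "",
    "14",
    "--XYZ--",
    "",
}, "\r\n")

local boundary = getBoundary("multipart/form-data; boundary=XYZ")
local fields = parseMultipart(body, boundary)
print(fields.buildingId, fields.roomCode)
```

Inside nginx the body string would come from ngx.req.get_body_data() after ngx.req.read_body(). For large uploads a streaming parser is the safer choice, since the whole body may not be held in memory.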

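The POST body is too large and ngx.req.get_post_args() returns no parameters

Reference: "post请求体过大导致ngx.req.get_post_args()取不到参数体的问题" (ngx.req.get_post_args() fails to get the parameters when the POST body is too large)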

How to set/return multiple values from a lua script

Define several variables $yourVariable in nginx,
then set the corresponding ngx.var.yourVariable in the lua script.
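
A minimal sketch of the pattern, using the two variables from the configuration in section 5. The helper chooseUpstream is hypothetical, and the first line creates a stand-in ngx table only so that the snippet also runs outside nginx.

```lua
-- Stand-in for the real ngx object when running outside nginx (assumption for this sketch)
local ngx = ngx or { var = {} }

-- Hypothetical helper: returns two values, the upstream name and the hash key
local function chooseUpstream(args)
    local userId = args["userId"]
    if userId == nil or userId == "" then
        return "rrbackend", ""
    end
    return "hashbackend", "userId" .. tostring(userId)
end

local upstream, key = chooseUpstream({ userId = "42" })

-- Each value goes into its own nginx variable; both must already be
-- declared in nginx.conf with "set", as in the location block above
ngx.var.backendupstream = upstream
ngx.var.hashkey = key
```

Note that ngx.var can only assign to variables that already exist in the nginx configuration, which is why the location block declares $hashkey and $backendupstream before running the script.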

How to print a table in lua

Reference: https://blog.csdn.net/mixi57/article/details/51697563
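
For quick debugging, a small recursive helper is often enough. The sketch below is one possible version and is not taken from the linked article; the name dump is hypothetical. Keys are sorted so that the output is stable, and tables that reference themselves are not handled.

```lua
-- Renders a value as a string; tables are expanded recursively with indentation
local function dump(value, indent)
    indent = indent or ""
    if type(value) ~= "table" then
        return tostring(value)
    end
    -- sort keys by their string form so the output order is deterministic
    local keys = {}
    for k in pairs(value) do
        keys[#keys + 1] = k
    end
    table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
    local lines = {}
    for _, k in ipairs(keys) do
        lines[#lines + 1] = indent .. "  " .. tostring(k) .. " = " .. dump(value[k], indent .. "  ")
    end
    return "{\n" .. table.concat(lines, ",\n") .. "\n" .. indent .. "}"
end

print(dump({ buildingId = "1", roomCode = "14" }))
```

In the nginx script the result can be passed to the logger, for example ngx.log(ngx.INFO, dump(args)).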

9 Further reading

nginx source directory layout

The listing below is quoted from an external source.

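.
├── auto            scripts that detect the system environment and drive the build
│   ├── cc          detection scripts for compiler-related build options
│   ├── lib         detection scripts for the libraries nginx needs at build time
│   ├── os          detection of platform-specific system parameters and system calls
│   └── types       helper scripts related to data types
├── conf            default configuration files, copied to the install directory by make install
├── contrib         utilities such as the geo configuration generator (geo2nginx.pl)
├── html            default web pages, copied to the install directory by make install
├── man             the nginx man page
└── src             the nginx source code
    ├── core        core source code, including common data structure definitions and core startup code such as the main function
    ├── event       wrappers around the system event mechanisms, plus the timer implementation
    │   └── modules modular implementations of the event mechanisms, such as select, poll, epoll, kqueue
    ├── http        code for nginx as an http server
    │   └── modules the various http feature modules
    ├── mail        code for nginx as a mail proxy server
    ├── misc        auxiliary code: a C++ header compatibility test and google_perftools support
    └── os          wrappers around the system functions of different architectures, exposing a uniform system call interface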

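nginx modular design

Reference: "Nginx模块化思想" (the modular design of Nginx)

nginx request processing flow

Reference: "nginx的请求处理" (request processing in nginx)
The key points to understand are: the multi-process model, the 11 phases of an HTTP request, the core flow of the phase handler, and the filter module chain.

nginx multi-process model

Reference: "Nginx 原理和架构｜原力计划" (Nginx principles and architecture)

How Nginx achieves high performance

An asynchronous, non-blocking, event-driven model

Reference: "各种I/O模型详解" (the various I/O models explained)

Multi-process model

1 The three common ways web servers handle concurrent I/O: multi-process, multi-thread, asynchronous.
Reference: "三种工作模型比较" (a comparison of the three working models)
2 Comparison with the memcached multi-thread model (built on libevent)
todo; the two turn out to be very similar

Memory pool design

Reference: "nginx内存池实现原理" (how the nginx memory pool is implemented)

Other details

todo: sendfile, mmap, shared memory (slab algorithm), buffer management (reuse mechanism / copy avoidance), chain management (chain reuse mechanism), lock handling, time cache / file cache, log mechanism…

nginx and lua coroutines / why nginx chose lua

References:
"Nginx与Lua" (Nginx and Lua)
lua-nginx-module official documentation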

Parameter-based consistent hash dispatching with nginx + lua