Overview

Requirement: OpenStack runs on top of Kubernetes, and OpenStack must be configured with IPv6 networking.

Kubernetes IPv4/IPv6 dual-stack allows both IPv4 and IPv6 addresses to be assigned to Pods and Services.

Supported features

Enabling IPv4/IPv6 dual-stack on a Kubernetes cluster provides the following features:

  • Dual-stack Pod networking (each Pod is assigned one IPv4 and one IPv6 address)
  • IPv4- and IPv6-enabled Services (each Service must use a single address family)
  • Pod off-cluster egress routed via both IPv4 and IPv6

Environment

  • OS: CentOS Linux release 7.7.1908 (Core)
  • Kubernetes version: v1.19.0
  • OpenStack version: Train
  • Kubernetes deployment method: kubeadm
  • Kubernetes network plugin: Calico v3.15.3

Node IP information; /etc/hosts is as follows:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
111.111.111.121 node01
111.111.111.122 node02
111.111.111.123 node03
111.111.111.121 registry.local

The node NIC configuration is as follows:

root@node01: ~ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8c:17:6f brd ff:ff:ff:ff:ff:ff
    inet 111.111.111.121/24 brd 111.111.111.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet 192.168.206.4/24 brd 192.168.206.255 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet 111.111.111.120/32 brd 111.111.111.120 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe8c:176f/64 scope link
       valid_lft forever preferred_lft forever
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:8c:e4:71 brd ff:ff:ff:ff:ff:ff
    inet6 2018::21/64 scope global noprefixroute
       valid_lft forever preferred_lft forever
    inet6 2018::20/64 scope global noprefixroute
       valid_lft forever preferred_lft forever
    inet6 fe80::6cd7:3d10:c933:a797/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
...
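
For reference, on CentOS 7 the static IPv6 addresses on ens224 would typically be set through a sysconfig file along the following lines. This is only a sketch: the actual ifcfg files were not shown in this environment, and the key names assume the standard network-scripts format.

# /etc/sysconfig/network-scripts/ifcfg-ens224 (sketch)
TYPE=Ethernet
NAME=ens224
DEVICE=ens224
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=yes
# Primary IPv6 address plus any additional addresses on this interface
IPV6ADDR=2018::21/64
IPV6ADDR_SECONDARIES="2018::20/64"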

Kernel parameters in /etc/sysctl.d/kubernetes.conf:

net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0 # avoid using swap; it is only used when the system is under OOM pressure
vm.overcommit_memory=1 # do not check whether enough physical memory is available
vm.panic_on_oom=0 # do not panic on OOM; let the OOM killer handle it
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
{% if enable_ipv6 | bool %}
net.ipv6.conf.all.disable_ipv6=0
net.ipv6.conf.all.forwarding=1
net.ipv6.conf.default.forwarding=1
{% else %}
net.ipv6.conf.all.disable_ipv6=1
{% endif %}
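
The file is rendered from a template (the enable_ipv6 switch). Once it is in place, the settings can be loaded without a reboot, roughly as follows:

# br_netfilter must be loaded for the net.bridge.* keys to exist
modprobe br_netfilter
# Load all files under /etc/sysctl.d/, including kubernetes.conf
sysctl --system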

Prerequisites

To run a Kubernetes cluster with IPv4/IPv6 dual-stack, the following prerequisites must be met:

  • Kubernetes 1.16 or later
  • Provider support for dual-stack networking (the cloud or infrastructure provider must be able to give Kubernetes nodes routable IPv4/IPv6 network interfaces)
  • A network plugin that supports dual-stack (such as Kubenet or Calico)

Enabling IPv4/IPv6 dual-stack

Kubernetes

The configuration described in the official documentation is as follows:

To enable IPv4/IPv6 dual-stack, enable the IPv6DualStack feature gate on the relevant cluster components and configure dual-stack cluster network ranges:

  • kube-apiserver:
    • --feature-gates="IPv6DualStack=true"
    • --service-cluster-ip-range=<IPv4 CIDR>,<IPv6 CIDR>
  • kube-controller-manager:
    • --feature-gates="IPv6DualStack=true"
    • --cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR>, e.g. --cluster-cidr=10.244.0.0/16,fc00::/48
    • --service-cluster-ip-range=<IPv4 CIDR>,<IPv6 CIDR>, e.g. --service-cluster-ip-range=10.0.0.0/16,fd00::/108
    • --node-cidr-mask-size-ipv4|--node-cidr-mask-size-ipv6, defaulting to /24 for IPv4 and /64 for IPv6
  • kubelet:
    • --feature-gates="IPv6DualStack=true"
  • kube-proxy:
    • --cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR>
    • --feature-gates="IPv6DualStack=true"

The cluster is deployed with kubeadm, and the master node is initialized from a YAML configuration file. The YAML is modified as follows:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "111.111.111.121"
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: "node01"
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Configure the kubelet: enable IPv6DualStack
featureGates:
  IPv6DualStack: true
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "111.111.111.120:6443"
controllerManager: {}
# Configure the cluster: enable IPv6DualStack
featureGates:
  IPv6DualStack: true
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: "k8s.gcr.io"
kind: ClusterConfiguration
kubernetesVersion: "v1.19.0"
networking:
  dnsDomain: cluster.local
  # IPv4/IPv6 address pools for Pods and Services.
  # The IPv6 service CIDR must not be too large (prefix length difference <= 20 bits),
  # otherwise kube-apiserver reports "specified --secondary-service-cluster-ip-range is too large";
  # see the validation code in the next section.
  podSubnet: 10.10.0.0/16,2019:20::/112
  serviceSubnet: 10.96.0.0/12,2019:30::/112
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Configure kube-proxy: enable IPv6DualStack
featureGates:
  SupportIPVSProxyMode: true
  IPv6DualStack: true
# mode:
# With IPVS in IPv6DualStack mode, a NodePort Service could not be reached via <node IPv6 address>:<nodePort>;
# iptables mode does not have this problem.
mode: iptables
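
With the file above saved as, say, kubeadm-config.yaml (the filename is an example), the first control-plane node would then be initialized roughly as follows:

kubeadm init --config kubeadm-config.yaml --upload-certs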

Kubernetes service subnet validation code

// TODO: Longer term we should read this from some config store, rather than a flag.
// validateClusterIPFlags is expected to be called after Complete()
func validateClusterIPFlags(options *ServerRunOptions) []error {
    var errs []error

    ... // omitted

    // note: While the cluster might be dualstack (i.e. pods with multiple IPs), the user may choose
    // to only ingress traffic within and into the cluster on one IP family only. this family is decided
    // by the range set on --service-cluster-ip-range. If/when the user decides to use dual stack services
    // the Secondary* must be of different IPFamily than --service-cluster-ip-range
    if secondaryServiceClusterIPRangeUsed {
        // Should be dualstack IPFamily(PrimaryServiceClusterIPRange) != IPFamily(SecondaryServiceClusterIPRange)
        dualstack, err := netutils.IsDualStackCIDRs([]*net.IPNet{&options.PrimaryServiceClusterIPRange, &options.SecondaryServiceClusterIPRange})
        if err != nil {
            errs = append(errs, errors.New("error attempting to validate dualstack for --service-cluster-ip-range and --secondary-service-cluster-ip-range"))
        }

        if !dualstack {
            errs = append(errs, errors.New("--service-cluster-ip-range and --secondary-service-cluster-ip-range must be of different IP family"))
        }

        // should be smallish sized cidr, this thing is kept in etcd
        // bigger cidr (specially those offered by IPv6) will add no value
        // significantly increase snapshotting time.
        var ones, bits = options.SecondaryServiceClusterIPRange.Mask.Size()
        if bits-ones > 20 {
            errs = append(errs, errors.New("specified --secondary-service-cluster-ip-range is too large"))
        }
    }

    return errs
}
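
As a concrete check against the serviceSubnet used above: for 2019:30::/112 the mask covers 112 of 128 bits, so bits-ones = 128-112 = 16 <= 20 and validation passes, whereas a range such as 2019:30::/100 would give 128-100 = 28 > 20 and be rejected as too large.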

Calico

  1. Edit the CNI config (calico-config ConfigMap in the manifest), and enable IPv4 and IPv6 address allocation by setting both fields to true.
"ipam": {
"type": "calico-ipam",
"assign_ipv4": "true",
"assign_ipv6": "true"
},

The complete calico-config ConfigMap is shown below for reference:

# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha is disabled.
  typha_service_name: "none"
  # Configure the backend to use.
  calico_backend: "bird"
  # Configure the MTU to use for workload interfaces and tunnels.
  # - If Wireguard is enabled, set to your network MTU - 60
  # - Otherwise, if VXLAN or BPF mode is enabled, set to your network MTU - 50
  # - Otherwise, if IPIP is enabled, set to your network MTU - 20
  # - Otherwise, if not using any encapsulation, set to your network MTU.
  veth_mtu: "1440"

  # The CNI network configuration to install on each node. The special
  # values in this config will be automatically populated.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "info",
          "datastore_type": "kubernetes",
          "nodename": "__KUBERNETES_NODE_NAME__",
          "mtu": __CNI_MTU__,
          "ipam": {
              "type": "calico-ipam",
              "assign_ipv4": "true",
              "assign_ipv6": "true"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "__KUBECONFIG_FILEPATH__"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        },
        {
          "type": "bandwidth",
          "capabilities": {"bandwidth": true}
        }
      ]
    }
  2. Configure IPv6 support by adding the following variable settings to the environment of the calico-node container (a YAML sketch follows the table):
Variable name               Value                 Comments
IP6                         "autodetect"
IP6_AUTODETECTION_METHOD    "interface=ens224"    the interface that carries the IPv6 addresses
FELIX_IPV6SUPPORT           "true"
CALICO_IPV6POOL_CIDR        2018:100::/112        the IPv6 address pool
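
For reference, a sketch of how these variables might look in the calico-node container's env section of calico.yaml; the values are the ones used in this environment.

# calico.yaml, DaemonSet calico-node -> spec.template.spec.containers[].env (sketch)
- name: IP6
  value: "autodetect"
- name: IP6_AUTODETECTION_METHOD
  value: "interface=ens224"
- name: FELIX_IPV6SUPPORT
  value: "true"
- name: CALICO_IPV6POOL_CIDR
  value: "2018:100::/112"

After editing the ConfigMap and the DaemonSet, re-apply the manifest (kubectl apply -f calico.yaml) or restart the DaemonSet (kubectl -n kube-system rollout restart daemonset/calico-node) so the calico-node Pods pick up the new configuration.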

Verification

Test setup

Create a simple Pod and Service with Nginx to verify functionality:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myweb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myweb
  template:
    metadata:
      labels:
        app: myweb
    spec:
      nodeSelector:
        type: node01
      containers:
      - image: nginx:latest
        name: myweb
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 60
          timeoutSeconds: 20
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 60
          timeoutSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: myweb
  name: myweb
spec:
  ## Kubernetes v1.19 does not support a Service configured with both IPv4 and IPv6.
  # Set .spec.ipFamily to either:
  #   IPv4: the API server assigns an IPv4 address from the service-cluster-ip-range
  #   IPv6: the API server assigns an IPv6 address from the service-cluster-ip-range
  ipFamily: IPv6
  ports:
  - port: 80
    targetPort: 80
    nodePort: 31111
  selector:
    app: myweb
  type: NodePort
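
Assuming the manifest above is saved as myweb.yaml (an example filename), it can be applied and inspected with:

kubectl apply -f myweb.yaml
kubectl get pods -l app=myweb -o wide
kubectl get svc myweb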

Network test Pod: busybox.yaml

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
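
After applying this Pod as well, basic IPv6 connectivity can be checked from inside busybox. The addresses below are the Pod and Service IPv6 addresses observed later in this article; substitute your own:

kubectl apply -f busybox.yaml
# ping the nginx Pod's IPv6 address
kubectl exec -it busybox -- ping6 -c 3 2019:20::8c40
# fetch the nginx default page via the Service's IPv6 ClusterIP
kubectl exec -it busybox -- wget -qO- "http://[2019:30::73fe]"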

Verify addressing

Verify node addressing

In this example, the node name is node01.

## Verify that IPv4/IPv6 Pod address ranges are configured
kubectl get nodes node01 -o go-template --template='{{range .spec.podCIDRs}}{{printf "%s\n" .}}{{end}}'

## Expected output:
# 10.10.0.0/24
# 2019:20::/120

## Verify that the node has detected IPv4 and IPv6 interfaces
kubectl get nodes node01 -o go-template --template='{{range .status.addresses}}{{printf "%s: %s \n" .type .address}}{{end}}'

## Expected output:
# InternalIP: 2018::21
# InternalIP: 111.111.111.121
# Hostname: node01

# Actual result: no IPv6 address was printed; only the IPv4 address and the Hostname were shown.
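
The same node address information can also be viewed with kubectl describe, which is sometimes easier to read:

kubectl describe node node01 | grep -A 5 "Addresses:"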

Verify Pod addressing

In this example, the Pod name is myweb-7645d99c58-v2wfx.

## Verify that the Pod received both IPv4 and IPv6 addresses
kubectl get pods myweb-7645d99c58-v2wfx -o go-template --template='{{range .status.podIPs}}{{printf "%s \n" .ip}}{{end}}'

## Expected output:
# 10.10.140.65
# 2019:20::8c40
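
An equivalent query using jsonpath, if go-template is not preferred:

kubectl get pod myweb-7645d99c58-v2wfx -o jsonpath='{.status.podIPs[*].ip}'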

Verify the Service

## View the Service created during test setup
kubectl get svc

## Expected output:
# NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
# kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP        4h53m
# myweb        NodePort    2019:30::73fe   <none>        80:31111/TCP   3h51m

## Verify the NodePort (using a node's IPv6 address)
curl -g -6 [2018::22]:31111

# Note:
# Since kube-proxy uses iptables, the CLUSTER-IP cannot be tested with ping, and it is not reachable from the cluster nodes;
# it has to be accessed from inside a Pod.
# Enter the busybox Pod and run the following two commands to test:
# - wget [2019:30::73fe]        -> access succeeds
# - telnet 2019:30::73fe 80     -> prints "Connected to 2019:30::73fe"

References