In this post, we’re going to experiment with deploying Linkerd 2.x to a Kubernetes cluster that uses network policies to govern ingress network traffic between pods.
UPDATE (Sept 24, 2020): The `linkerd inject` command has since been modified to use the same code path as the auto proxy injection feature. When applying the `deny-all-ingress` network policy to the `linkerd` namespace, ensure that traffic from the K8s API server is permitted to port 443 of the proxy injector, service profile validator and tap components.
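As a sketch, one way to permit that traffic is an additional ingress rule that opens TCP port 443. The policy name and the decision to target all pods in the namespace (rather than narrowing the `podSelector` to just the webhook components) are illustrative assumptions:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-webhook-ingress
spec:
  podSelector: {} # could be narrowed to the webhook components
  ingress:
  - ports: # no 'from' clause, so any source may reach TCP 443
    - protocol: TCP
      port: 443
```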
About Network Policy
A network policy defines ingress and egress rules to control communication between pods in a cluster. It is a namespace-scoped resource. Pods are selected by specifying label selectors in the policy.
By default, there are no restrictions on pod-to-pod communication within a Kubernetes cluster. Once a pod is selected by a network policy, traffic to it is accepted or rejected based on the rules defined in the policy.
When using network policy, it is important to ensure that your Kubernetes cluster is deployed with a networking solution that supports network policies; on clusters without one, the policies have no effect.
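For example, a minimal policy might admit traffic to pods labeled `app: db` only from pods labeled `app: backend` in the same namespace. The labels here are hypothetical:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-backend-to-db
spec:
  podSelector:
    matchLabels:
      app: db # the pods this policy governs
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend # the permitted clients
```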
Cluster Setup
The steps in this post are performed with:
- Linkerd 2.x edge-19.1.2
- GKE 1.11.6-gke.0 (which uses Calico to support network policy)
Installation instructions for the Linkerd 2.x CLI can be found here.
The GKE cluster consists of 3 nodes in the us-west1 region with network policy enabled. It uses alias IP addresses in the 10.0.0.0/16 range as the pod address range.
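A comparable cluster can be created along these lines. The cluster name is a placeholder, and the flags reflect my understanding of the gcloud CLI, which may differ across versions:

```shell
$ gcloud container clusters create linkerd-demo \
    --region us-west1 \
    --num-nodes 1 \
    --enable-ip-alias \
    --enable-network-policy
# --num-nodes is per zone, so a regional cluster gets 3 nodes
```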
Defining The Policies
All the rules in the network policies use the `namespaceSelector` and `podSelector` label selectors. The labels specified in the selectors can be changed to suit your convention.
At the time of this writing, there isn’t a convenient way to add new labels to all new and existing proxies across the entire cluster. See issue #2001.
The following is the list of network policies that are deployed to the cluster.
All namespaces start with the `deny-all-ingress` network policy, which denies all ingress traffic except for traffic generated from within the same namespace. It’s not uncommon to see this kind of network policy being used to enforce namespace-based soft multi-tenancy in Kubernetes clusters.
```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
```
The `allow-meshed-ingress` policy is applied to the `linkerd` namespace. This policy allows traffic from meshed pods in all namespaces.
```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-meshed-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector: {}
      podSelector:
        matchExpressions:
        - key: linkerd.io/proxy-deployment
          operator: Exists
```
In the `emojivoto` application namespace, we add the `allow-control-plane-ingress` network policy to allow traffic from the Linkerd control plane. The `linkerd.io/control-plane-ns: linkerd` label used in the selector is an arbitrary choice.
```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-control-plane-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          linkerd.io/control-plane-ns: linkerd
```
Deploying The Linkerd Control Plane
Create the `linkerd` namespace, and apply the `deny-all-ingress` network policy:
```shell
$ kubectl create ns linkerd

$ cat <<EOF | kubectl -n linkerd apply -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
EOF
```
Install the Linkerd control plane with automatic TLS enabled:
```shell
$ linkerd install --tls=optional | kubectl apply -f -
```
Run some checks on the control plane:
```shell
$ linkerd check
kubernetes-api
--------------
✔ can initialize the client
✔ can query the Kubernetes API
kubernetes-version
------------------
✔ is running the minimum Kubernetes API version
linkerd-existence
-----------------
✔ control plane namespace exists
✔ controller pod is running
✔ can initialize the client
✔ can query the control plane API
linkerd-api
-----------
✔ control plane pods are ready
✔ can query the control plane API
✔ [kubernetes] control plane can talk to Kubernetes
✔ [prometheus] control plane can talk to Prometheus
linkerd-service-profile
-----------------------
✔ no invalid service profiles
linkerd-version
---------------
✔ can determine the latest version
✔ cli is up-to-date
control-plane-version
---------------------
✔ control plane is up-to-date
Status check results are ✔
```
Run `check` on the data plane:
```shell
$ linkerd check -n linkerd --proxy
kubernetes-api
--------------
✔ can initialize the client
✔ can query the Kubernetes API
kubernetes-version
------------------
✔ is running the minimum Kubernetes API version
linkerd-existence
-----------------
✔ control plane namespace exists
✔ controller pod is running
✔ can initialize the client
✔ can query the control plane API
linkerd-api
-----------
✔ control plane pods are ready
✔ can query the control plane API
✔ [kubernetes] control plane can talk to Kubernetes
✔ [prometheus] control plane can talk to Prometheus
linkerd-service-profile
-----------------------
✔ no invalid service profiles
linkerd-version
---------------
✔ can determine the latest version
✔ cli is up-to-date
linkerd-data-plane
------------------
✔ data plane namespace exists
✔ data plane proxies are ready
✔ data plane proxy metrics are present in Prometheus
✔ data plane is up-to-date
Status check results are ✔
```
Launch the Linkerd dashboard:

```shell
$ linkerd dashboard
```
It looks like everything is working 👍.
To see the effect of the `deny-all-ingress` network policy, use a `curl` pod to access the Linkerd controller API from within and outside of the `linkerd` namespace:
```shell
# run curl in the linkerd namespace
$ kubectl -n linkerd run curl --image=appropriate/curl --rm -it --restart=Never --command -- curl -o - linkerd-controller-api.linkerd:8085
POST required # response

# run curl in the default namespace
$ kubectl -n default run curl --image=appropriate/curl --rm -it --restart=Never --command -- curl -o - linkerd-controller-api.linkerd:8085
If you don't see a command prompt, try pressing enter.
curl: (7) Failed to connect to linkerd-controller-api.linkerd port 8085: Operation timed out
pod default/curl terminated (Error)
```
Deploying The Application
Create the `emojivoto` namespace and apply the same `deny-all-ingress` network policy to it:
```shell
$ kubectl create ns emojivoto

$ cat <<EOF | kubectl -n emojivoto apply -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
EOF
```
Deploy and mesh the emojivoto application.
```shell
$ curl -sL https://run.linkerd.io/emojivoto.yml | linkerd inject --tls=optional - | kubectl apply -f -
```
Run a `check` on the proxies in the `emojivoto` namespace:
```shell
$ linkerd check -n emojivoto --proxy
kubernetes-api
--------------
✔ can initialize the client
✔ can query the Kubernetes API
kubernetes-version
------------------
✔ is running the minimum Kubernetes API version
linkerd-existence
-----------------
✔ control plane namespace exists
✔ controller pod is running
✔ can initialize the client
✔ can query the control plane API
linkerd-api
-----------
✔ control plane pods are ready
✔ can query the control plane API
✔ [kubernetes] control plane can talk to Kubernetes
✔ [prometheus] control plane can talk to Prometheus
linkerd-service-profile
-----------------------
✔ no invalid service profiles
linkerd-version
---------------
✔ can determine the latest version
✔ cli is up-to-date
linkerd-data-plane
------------------
✔ data plane namespace exists
✔ data plane proxies are ready
⠁ data plane proxy metrics are present in Prometheus -- Data plane metrics not found for emojivoto/emoji-8df6758bb-5hnqx, emojivoto/voting-5475cbcc54-lnjxg, emojivoto/web-5cdbcd84d-gn6zf, emojivoto/vote-bot-696
...
```
Notice the `data plane proxy metrics not found` errors. This happens because our network policy is restricting ingress traffic to the `linkerd` namespace.
On the Linkerd dashboard, we see that there are no metrics from the `emojivoto` namespace 😞.
Try to access the emojivoto web application at localhost:8080 using port forwarding:

```shell
$ kubectl -n emojivoto port-forward $(kubectl -n emojivoto get po -l app=web-svc -oname | cut -d/ -f 2) 8080:80
```
Notice that the application doesn’t load up all the emojis 😧.
Debugging It
Looking at the logs of the `linkerd-proxy` container of the `web` pod (after a few minutes), we see that it’s timing out while attempting to connect to `linkerd-proxy-api.linkerd.svc.cluster.local:8086`:
```shell
$ kubectl -n emojivoto logs web-596f8df69d-mxj5j linkerd-proxy
INFO linkerd2_proxy::app::main using controller at Some(Name(NameAddr { name: DnsName(DNSName("linkerd-proxy-api.linkerd.svc.cluster.local")), port: 8086 }))
INFO linkerd2_proxy::app::main routing on V4(127.0.0.1:4140)
INFO linkerd2_proxy::app::main proxying on V4(0.0.0.0:4143) to None
INFO linkerd2_proxy::app::main serving Prometheus metrics on V4(0.0.0.0:4191)
INFO linkerd2_proxy::app::main protocol detection disabled for inbound ports {25, 3306}
INFO linkerd2_proxy::app::main protocol detection disabled for outbound ports {25, 3306}
INFO admin={bg=tls-config} linkerd2_proxy::transport::tls::config loaded TLS configuration.
INFO admin={bg=tls-config} linkerd2_proxy::transport::tls::config loaded TLS configuration.
INFO admin={bg=tls-config} linkerd2_proxy::transport::tls::config loaded TLS configuration.
INFO admin={bg=tls-config} linkerd2_proxy::transport::tls::config loaded TLS configuration.
INFO admin={bg=tls-config} linkerd2_proxy::transport::tls::config loaded TLS configuration.
ERR! proxy={server=out listen=127.0.0.1:4140 remote=10.0.2.124:51888} linkerd2_proxy::proxy::http::router service error: operation timed out after 10s
WARN linkerd-proxy-api.linkerd.svc.cluster.local:8086 linkerd2_proxy::proxy::reconnect connect error to Config { addr: Name(NameAddr { name: DnsName(DNSName("linkerd-proxy-api.linkerd.svc.cluster.local")), port: 8086 }), tls_server_identity: Some(Identity(DnsName(DNSName("controller.deployment.linkerd.linkerd-managed.linkerd.svc.cluster.local")))), tls_config: Some(ClientConfig), backoff: 5s, connect_timeout: 3s }: Connection timed out (os error 110) (address: 172.16.10.238:8086)
```
When we try to reach the `linkerd-proxy-api.linkerd` service from the `emojivoto` namespace using `curl`, we get a timeout error:
```shell
$ kubectl -n emojivoto run curl --image=appropriate/curl --rm -it --restart=Never --command -- curl -o - linkerd-proxy-api.linkerd.svc.cluster.local:8086
If you don't see a command prompt, try pressing enter.
curl: (7) Failed to connect to linkerd-proxy-api.linkerd.svc.cluster.local port 8086: Operation timed out
pod emojivoto/curl terminated (Error)
```
(This works if `curl` is run in the `linkerd` namespace.)
Fixing It
Apply the `allow-meshed-ingress` network policy to the `linkerd` namespace. This will allow traffic from the proxies in the `emojivoto` namespace to reach the control plane.
```shell
$ cat <<EOF | kubectl -n linkerd apply -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-meshed-ingress
spec:
  podSelector: {} # select all pods
  ingress:
  - from:
    - namespaceSelector: {}
      podSelector:
        matchExpressions:
        - key: linkerd.io/proxy-deployment
          operator: Exists
EOF
```
If automatic TLS is enabled via the `--tls=optional` flag, we will start to see some `connection reset by peer` errors in the Linkerd controller’s `linkerd-proxy` container. The `remote` IPs in the logs are those of the emojivoto pods.
```shell
$ kubectl -n linkerd logs -f linkerd-controller-699fd5656d-brfwh linkerd-proxy
...
INFO proxy={server=in listen=0.0.0.0:4143 remote=10.0.0.114:32944} linkerd2_proxy::proxy::tcp forward duplex error: Connection reset by peer (os error 104)
INFO proxy={server=in listen=0.0.0.0:4143 remote=10.0.1.103:36094} linkerd2_proxy::proxy::tcp forward duplex error: Connection reset by peer (os error 104)
INFO proxy={server=in listen=0.0.0.0:4143 remote=10.0.2.131:58888} linkerd2_proxy::proxy::tcp forward duplex error: Connection reset by peer (os error 104)
...
```
The next step isn’t necessary if automatic TLS isn’t enabled.
Label the `linkerd` namespace with a custom label:
```shell
$ kubectl label ns linkerd linkerd.io/control-plane-ns=linkerd
```
This label must match the selector that will be used in the following network policy.
Apply the `allow-control-plane-ingress` network policy to the `emojivoto` namespace.
```shell
$ cat <<EOF | kubectl -n emojivoto apply -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-control-plane-ingress
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          linkerd.io/control-plane-ns: linkerd
EOF
```
This policy allows incoming traffic from the control plane components into the `emojivoto` namespace.
Does It Work?
After a few minutes, the dashboard starts displaying metrics from the `emojivoto` namespace 🎉 🎉 🎉
The emojivoto application also works as intended 🎆 🎆 🎆.
Let’s `tap` into some live traffic of the `web` component:
Use `top` to see sorted information about live traffic:
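If you are following along without the dashboard, the CLI offers equivalents. Assuming the deployment is named `web`, something along these lines should work:

```shell
# stream live request metadata from the web deployment
$ linkerd tap deploy/web -n emojivoto

# sorted, aggregated view of the same traffic
$ linkerd top deploy/web -n emojivoto
```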
Grafana still works 👇 👏:
Conclusion
In this post, we deployed Linkerd 2.x to a Kubernetes cluster that uses network policies to secure pod-to-pod communication. First, we applied a `deny-all-ingress` network policy to the `linkerd` and `emojivoto` namespaces. Then we deployed additional network policies to allow meshed traffic to be exchanged between the two namespaces.
We used the namespace and pod label selectors in the ‘allow’ ingress rules. Another supported selector is `ipBlock`. Refer to the Kubernetes docs for more information.
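For instance, a hypothetical rule admitting traffic only from the cluster’s pod address range could look like this (the policy name is made up; the CIDR matches the pod range used earlier):

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-cidr
spec:
  podSelector: {}
  ingress:
  - from:
    - ipBlock:
        cidr: 10.0.0.0/16 # an 'except' list can carve out sub-ranges
```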
Finally, shoutout to Ahmet Alp Balkan for the awesome recipes at https://github.com/ahmetb/kubernetes-network-policy-recipes. Check out his awesome post on Securing Kubernetes Cluster Networking.