Installer for a Kubernetes cluster in VMs #2

Open
opened 2025-12-08 08:58:22 +00:00 by despiegk · 1 comment
  • 5 VMs
  • configure K3s with 3 masters and 2 worker nodes
  • create rhaj scripts for the configuration; execute them in the VMs via hero'd
    • push to VM & execute

K3s HA Installation on 5 VMs with Mycelium Network

Scenario Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                          Ubuntu 24.04 Host                                  │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                    Mycelium IPv6 Network (400::/7)                  │   │
│  │                         (Already Bridged)                           │   │
│  │                                                                     │   │
│  │  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐           │   │
│  │  │  VM1   │ │  VM2   │ │  VM3   │ │  VM4   │ │  VM5   │           │   │
│  │  │Master1 │ │Master2 │ │Master3 │ │Worker1 │ │Worker2 │           │   │
│  │  └────────┘ └────────┘ └────────┘ └────────┘ └────────┘           │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                              Masquerade → Internet                          │
└─────────────────────────────────────────────────────────────────────────────┘

Step 1: Create Inventory File (on your workstation/host)

First, gather the Mycelium IPv6 addresses from each VM and create an inventory:

# Create working directory
mkdir -p ~/k3s-cluster && cd ~/k3s-cluster

# Create inventory file - EDIT with your actual VM IPs
cat > inventory.env << 'EOF'
# VM SSH access (use Mycelium IPv6 addresses)
# Get each VM's IP with: ssh user@vm "ip -6 addr show mycelium scope global"

MASTER1="543:66c5:6430:8f31:5293:1ad9:694a:70f3"
MASTER2="543:aaaa:bbbb:cccc:1111:2222:3333:4444"
MASTER3="543:dddd:eeee:ffff:5555:6666:7777:8888"
WORKER1="543:1111:2222:3333:aaaa:bbbb:cccc:dddd"
WORKER2="543:4444:5555:6666:eeee:ffff:0000:1111"

# SSH user for VMs
SSH_USER="root"

EOF

# Generate a secure token ONCE and persist its literal value. A command
# substitution inside the quoted heredoc above would re-run on every
# `source inventory.env`, handing each node a different token.
echo "K3S_TOKEN=\"k3s-cluster-$(openssl rand -hex 16)\"" >> inventory.env

source inventory.env
echo "K3S_TOKEN: $K3S_TOKEN"
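Before running anything against the VMs, it can help to sanity-check that the inventory addresses actually fall in Mycelium's global overlay range (400::/7). A minimal sketch, assuming only that the leading 16-bit group encodes the prefix; `is_mycelium_ip` is a hypothetical helper, not part of any tool:

```shell
#!/bin/bash
# Hypothetical helper: check that an address lies in Mycelium's 400::/7
# overlay range, i.e. the leading 16-bit group masked to /7 equals 0x0400.
is_mycelium_ip() {
    local first="${1%%:*}"                       # leading hextet, e.g. "543"
    [[ "$first" =~ ^[0-9a-fA-F]{1,4}$ ]] || return 1
    local val=$((16#$first))
    (( (val & 0xfe00) == 0x0400 ))               # keep top 7 bits, compare
}

is_mycelium_ip "543:66c5:6430:8f31:5293:1ad9:694a:70f3" && echo "mycelium"
is_mycelium_ip "2001:db8::1" || echo "not mycelium"
```

Running it over every variable in `inventory.env` catches a pasted LAN or documentation address before it ends up baked into certificates.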

Step 2: Prepare All VMs

Create and run the preparation script on all VMs:

#!/bin/bash
# Run from your workstation - prepares all VMs

source ~/k3s-cluster/inventory.env

# Preparation script to run on each VM
PREP_SCRIPT='#!/bin/bash
set -e

echo "=== Preparing VM for K3s ==="

# Update and install dependencies
apt update && apt install -y curl wget net-tools iptables nftables conntrack jq

# Enable forwarding
cat > /etc/sysctl.d/99-k3s.conf << SYSCTL
net.ipv6.conf.all.forwarding=1
net.ipv6.conf.default.forwarding=1
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
SYSCTL

modprobe br_netfilter
echo "br_netfilter" >> /etc/modules-load.d/k8s.conf
sysctl --system

# Disable swap
swapoff -a
sed -i "/swap/d" /etc/fstab

# Disable UFW
systemctl disable --now ufw 2>/dev/null || true

# Get network info
MYCELIUM_IP=$(ip -6 addr show dev mycelium scope global | grep -oP "(?<=inet6 )[0-9a-f:]+(?=/)")
echo "VM Mycelium IP: $MYCELIUM_IP"

# Find public interface for masquerading
PUBLIC_IF=$(ip -4 route show default 2>/dev/null | head -1 | awk "{print \$5}")
if [ -z "$PUBLIC_IF" ]; then
    PUBLIC_IF=$(ip -6 route show default | head -1 | awk "{print \$5}")
fi
echo "Public interface: $PUBLIC_IF"

# Setup masquerading with nftables
mkdir -p /etc/nftables.d
cat > /etc/nftables.d/k3s-nat.conf << NFTEOF
#!/usr/sbin/nft -f
table inet k3s_nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        oifname "$PUBLIC_IF" ip saddr 10.42.0.0/16 masquerade
        oifname "$PUBLIC_IF" ip saddr 10.43.0.0/16 masquerade
        oifname "$PUBLIC_IF" ip6 saddr fd00::/8 masquerade
    }
}
table inet k3s_filter {
    chain forward {
        type filter hook forward priority filter; policy accept;
        ip saddr 10.42.0.0/16 accept
        ip daddr 10.42.0.0/16 accept
        ip6 saddr 400::/7 accept
        ip6 daddr 400::/7 accept
        ct state established,related accept
    }
}
NFTEOF

nft -f /etc/nftables.d/k3s-nat.conf

# Make the rules survive a reboot: the stock /etc/nftables.conf on Ubuntu
# does not include /etc/nftables.d, so add the include line once.
grep -q "nftables.d" /etc/nftables.conf 2>/dev/null || echo "include \"/etc/nftables.d/*.conf\"" >> /etc/nftables.conf
systemctl enable nftables

echo "=== VM Preparation Complete ==="
'

# Run on all VMs
ALL_VMS="$MASTER1 $MASTER2 $MASTER3 $WORKER1 $WORKER2"

for VM in $ALL_VMS; do
    echo "=========================================="
    echo "Preparing VM: $VM"
    echo "=========================================="
    ssh -o StrictHostKeyChecking=no ${SSH_USER}@${VM} "$PREP_SCRIPT"
    echo ""
done

echo "All VMs prepared successfully!"
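The interface detection in the prep script simply takes field 5 of the first default route. A quick way to check that parsing offline, against canned `ip route show default` output (the route values below are made up):

```shell
#!/bin/bash
# Reproduce the PUBLIC_IF parsing from the prep script against a sample
# default route; the device name is the 5th whitespace-separated field.
sample="default via 192.168.122.1 dev enp1s0 proto dhcp metric 100"
PUBLIC_IF=$(printf '%s\n' "$sample" | head -1 | awk '{print $5}')
echo "$PUBLIC_IF"   # → enp1s0
```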

Step 3: Install K3s on First Master

#!/bin/bash
# Run from workstation - installs first master

source ~/k3s-cluster/inventory.env

echo "=== Installing K3s on Master 1 ($MASTER1) ==="

ssh ${SSH_USER}@${MASTER1} << EOFMASTER1
set -e

MYCELIUM_IP=\$(ip -6 addr show dev mycelium scope global | grep -oP '(?<=inet6 )[0-9a-f:]+(?=/)')
echo "Node IP: \$MYCELIUM_IP"

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server" sh -s - \\
    --cluster-init \\
    --token "${K3S_TOKEN}" \\
    --node-ip "\${MYCELIUM_IP}" \\
    --advertise-address "\${MYCELIUM_IP}" \\
    --node-external-ip "\${MYCELIUM_IP}" \\
    --flannel-iface mycelium \\
    --flannel-ipv6-masq \\
    --cluster-cidr "10.42.0.0/16,fd42::/48" \\
    --service-cidr "10.43.0.0/16,fd43::/112" \\
    --cluster-dns "10.43.0.10" \\
    --disable servicelb \\
    --disable traefik \\
    --tls-san "\${MYCELIUM_IP}" \\
    --node-taint "node-role.kubernetes.io/master=true:NoSchedule" \\
    --write-kubeconfig-mode 644

echo "Waiting for K3s to start..."
sleep 20

# Verify
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get nodes
EOFMASTER1

# Copy kubeconfig to workstation
echo "Copying kubeconfig..."
ssh ${SSH_USER}@${MASTER1} "cat /etc/rancher/k3s/k3s.yaml" | \
    sed "s/127.0.0.1/[${MASTER1}]/g" | \
    sed "s/\[::1\]/[${MASTER1}]/g" > ~/k3s-cluster/kubeconfig.yaml

echo "Kubeconfig saved to ~/k3s-cluster/kubeconfig.yaml"
echo "Use: export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml"
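One detail worth double-checking in the kubeconfig rewrite: an IPv6 server address must be bracketed in the URL, otherwise the `:6443` port separator becomes ambiguous. A standalone check of that sed transformation (sample line, not taken from a real cluster):

```shell
#!/bin/bash
# The rewritten server URL needs [brackets] around the IPv6 literal so
# that ":6443" still parses as the port.
MASTER1="543:66c5:6430:8f31:5293:1ad9:694a:70f3"
line="    server: https://127.0.0.1:6443"
rewritten=$(printf '%s\n' "$line" | sed "s/127.0.0.1/[${MASTER1}]/g")
echo "$rewritten"
# →     server: https://[543:66c5:6430:8f31:5293:1ad9:694a:70f3]:6443
```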

Step 4: Join Additional Masters

#!/bin/bash
# Run from workstation - joins master 2 and 3

source ~/k3s-cluster/inventory.env

for MASTER_IP in $MASTER2 $MASTER3; do
    echo "=========================================="
    echo "Joining Master: $MASTER_IP"
    echo "=========================================="
    
    ssh ${SSH_USER}@${MASTER_IP} << EOFMASTER
set -e

MYCELIUM_IP=\$(ip -6 addr show dev mycelium scope global | grep -oP '(?<=inet6 )[0-9a-f:]+(?=/)')
echo "Node IP: \$MYCELIUM_IP"
echo "Joining via: ${MASTER1}"

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server" sh -s - \\
    --server "https://[${MASTER1}]:6443" \\
    --token "${K3S_TOKEN}" \\
    --node-ip "\${MYCELIUM_IP}" \\
    --advertise-address "\${MYCELIUM_IP}" \\
    --node-external-ip "\${MYCELIUM_IP}" \\
    --flannel-iface mycelium \\
    --flannel-ipv6-masq \\
    --cluster-cidr "10.42.0.0/16,fd42::/48" \\
    --service-cidr "10.43.0.0/16,fd43::/112" \\
    --cluster-dns "10.43.0.10" \\
    --disable servicelb \\
    --disable traefik \\
    --tls-san "\${MYCELIUM_IP}" \\
    --node-taint "node-role.kubernetes.io/master=true:NoSchedule" \\
    --write-kubeconfig-mode 644

echo "Waiting for node to join..."
sleep 15
EOFMASTER

    echo ""
done

# Check cluster status
export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml
echo "=== Cluster Status ==="
kubectl get nodes

Step 5: Join Worker Nodes

#!/bin/bash
# Run from workstation - joins workers

source ~/k3s-cluster/inventory.env

for WORKER_IP in $WORKER1 $WORKER2; do
    echo "=========================================="
    echo "Joining Worker: $WORKER_IP"
    echo "=========================================="
    
    ssh ${SSH_USER}@${WORKER_IP} << EOFWORKER
set -e

MYCELIUM_IP=\$(ip -6 addr show dev mycelium scope global | grep -oP '(?<=inet6 )[0-9a-f:]+(?=/)')
echo "Node IP: \$MYCELIUM_IP"
echo "Joining via: ${MASTER1}"

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="agent" sh -s - \\
    --server "https://[${MASTER1}]:6443" \\
    --token "${K3S_TOKEN}" \\
    --node-ip "\${MYCELIUM_IP}" \\
    --node-external-ip "\${MYCELIUM_IP}" \\
    --flannel-iface mycelium

echo "Worker joined!"
EOFWORKER

    echo ""
done

# Final status
export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml
echo "=== Final Cluster Status ==="
kubectl get nodes -o wide

Step 6: Verification Tests

6.1 All-in-One Test Script

#!/bin/bash
# Comprehensive cluster verification - run from workstation

source ~/k3s-cluster/inventory.env
export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml

echo "╔═══════════════════════════════════════════════════════════════╗"
echo "║            K3s Cluster Verification Tests                     ║"
echo "╚═══════════════════════════════════════════════════════════════╝"

# Test 1: Node Status
echo -e "\n[TEST 1] Node Status"
echo "─────────────────────────────────────────"
kubectl get nodes -o wide
READY_NODES=$(kubectl get nodes --no-headers | grep -c " Ready")
TOTAL_NODES=$(kubectl get nodes --no-headers | wc -l)
if [ "$READY_NODES" -eq 5 ]; then
    echo "✓ All 5 nodes Ready"
else
    echo "✗ Only $READY_NODES/$TOTAL_NODES nodes Ready"
fi

# Test 2: System Pods
echo -e "\n[TEST 2] System Pods"
echo "─────────────────────────────────────────"
kubectl get pods -n kube-system
NOT_RUNNING=$(kubectl get pods -n kube-system --no-headers | grep -v "Running\|Completed" | wc -l)
if [ "$NOT_RUNNING" -eq 0 ]; then
    echo "✓ All system pods healthy"
else
    echo "✗ $NOT_RUNNING pods not running"
fi

# Test 3: Mycelium Connectivity
echo -e "\n[TEST 3] Mycelium IPv6 Connectivity"
echo "─────────────────────────────────────────"
ALL_VMS="$MASTER1 $MASTER2 $MASTER3 $WORKER1 $WORKER2"
for VM in $ALL_VMS; do
    echo -n "  Ping $VM: "
    if ping -6 -c 1 -W 2 "$VM" > /dev/null 2>&1; then
        echo "✓"
    else
        echo "✗"
    fi
done

# Test 4: K3s API on all masters
echo -e "\n[TEST 4] K3s API Endpoints"
echo "─────────────────────────────────────────"
for MASTER_IP in $MASTER1 $MASTER2 $MASTER3; do
    echo -n "  API at [$MASTER_IP]:6443: "
    if curl -sk --connect-timeout 3 "https://[$MASTER_IP]:6443/healthz" 2>/dev/null | grep -q ok; then
        echo "✓"
    else
        echo "✗"
    fi
done

# Test 5: Deploy test pods
echo -e "\n[TEST 5] Deploying Test Workload"
echo "─────────────────────────────────────────"
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: k3s-test
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: network-test
  namespace: k3s-test
spec:
  selector:
    matchLabels:
      app: network-test
  template:
    metadata:
      labels:
        app: network-test
    spec:
      tolerations:
        - operator: "Exists"
      containers:
        - name: test
          image: busybox:latest
          command: ["sleep", "3600"]
          resources:
            limits:
              memory: "32Mi"
              cpu: "50m"
EOF

echo "Waiting for test pods..."
kubectl wait --for=condition=ready pod -l app=network-test -n k3s-test --timeout=120s 2>/dev/null
kubectl get pods -n k3s-test -o wide

# Test 6: Pod-to-Pod Communication
echo -e "\n[TEST 6] Pod-to-Pod Communication"
echo "─────────────────────────────────────────"
SOURCE_POD=$(kubectl get pods -n k3s-test -o jsonpath='{.items[0].metadata.name}')
POD_IPS=$(kubectl get pods -n k3s-test -o jsonpath='{range .items[*]}{.status.podIP}{" "}{end}')

for POD_IP in $POD_IPS; do
    echo -n "  Ping $POD_IP: "
    if kubectl exec -n k3s-test "$SOURCE_POD" -- ping -c 1 -W 2 "$POD_IP" > /dev/null 2>&1; then
        echo "✓"
    else
        echo "✗"
    fi
done

# Test 7: Internet Access (Masquerading)
echo -e "\n[TEST 7] Internet Access (Masquerading)"
echo "─────────────────────────────────────────"

echo -n "  DNS Resolution: "
if kubectl exec -n k3s-test "$SOURCE_POD" -- nslookup google.com > /dev/null 2>&1; then
    echo "✓"
else
    echo "✗"
fi

echo -n "  HTTP Access: "
if kubectl exec -n k3s-test "$SOURCE_POD" -- wget -q -O /dev/null --timeout=5 http://google.com 2>/dev/null; then
    echo "✓"
else
    echo "✗"
fi

echo -n "  HTTPS Access: "
if kubectl exec -n k3s-test "$SOURCE_POD" -- wget -q -O /dev/null --timeout=5 https://google.com 2>/dev/null; then
    echo "✓"
else
    echo "✗"
fi

# Test 8: Service Discovery
echo -e "\n[TEST 8] Service Discovery"
echo "─────────────────────────────────────────"
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: test-svc
  namespace: k3s-test
spec:
  selector:
    app: network-test
  ports:
    - port: 80
EOF

SVC_IP=$(kubectl get svc test-svc -n k3s-test -o jsonpath='{.spec.clusterIP}')
echo "  Service IP: $SVC_IP"
echo -n "  DNS lookup test-svc.k3s-test.svc.cluster.local: "
if kubectl exec -n k3s-test "$SOURCE_POD" -- nslookup test-svc.k3s-test.svc.cluster.local > /dev/null 2>&1; then
    echo "✓"
else
    echo "✗"
fi

# Summary
echo -e "\n╔═══════════════════════════════════════════════════════════════╗"
echo "║                    Test Summary                                ║"
echo "╚═══════════════════════════════════════════════════════════════╝"
echo "Cluster: $(kubectl get nodes --no-headers | grep -c " Ready")/5 nodes ready"
echo "System:  $(kubectl get pods -n kube-system --no-headers | grep -c Running) pods running"
echo "Test:    $(kubectl get pods -n k3s-test --no-headers | grep -c Running) test pods deployed"
echo ""
echo "Kubeconfig: ~/k3s-cluster/kubeconfig.yaml"
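A subtlety in the Ready-node counting above: `NotReady` contains the substring `Ready`, so the grep pattern needs the leading space to match the status column boundary. A self-contained demonstration on canned `kubectl get nodes` output (hypothetical node names):

```shell
#!/bin/bash
# "NotReady" also matches the bare pattern "Ready"; anchoring with the
# preceding space counts only nodes that are truly Ready.
nodes="vm1   Ready      control-plane   5m
vm2   NotReady   control-plane   5m"
good=$(printf '%s\n' "$nodes" | grep -c " Ready")
naive=$(printf '%s\n' "$nodes" | grep -c "Ready")
echo "anchored=$good naive=$naive"   # → anchored=1 naive=2
```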

6.2 Test HA Failover

#!/bin/bash
# Test HA by stopping a master

source ~/k3s-cluster/inventory.env
export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml

echo "=== HA Failover Test ==="
echo ""
echo "Current state:"
kubectl get nodes

echo ""
echo "Stopping K3s on Master 2..."
ssh ${SSH_USER}@${MASTER2} "systemctl stop k3s"

echo "Waiting 10 seconds..."
sleep 10

echo ""
echo "Cluster state with Master 2 down:"
kubectl get nodes

echo ""
echo "Testing API still works:"
kubectl get pods -n kube-system --no-headers | head -5

echo ""
echo "Restarting Master 2..."
ssh ${SSH_USER}@${MASTER2} "systemctl start k3s"

echo "Waiting 20 seconds for rejoin..."
sleep 20

echo ""
echo "Final cluster state:"
kubectl get nodes
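The failover behaviour follows from etcd quorum arithmetic: n voting members need floor(n/2)+1 alive, so 3 masters tolerate exactly one failure, and stopping a second master would freeze the API. A tiny sketch of that arithmetic:

```shell
#!/bin/bash
# etcd quorum for n voting members is floor(n/2)+1; the cluster survives
# the loss of n - quorum members.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

echo "3 masters: quorum=$(quorum 3), tolerates=$(tolerated 3)"
echo "5 masters: quorum=$(quorum 5), tolerates=$(tolerated 5)"
```

This is also why the guide uses 3 masters rather than 2: two members have a quorum of 2 and tolerate no failure at all.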

Step 7: Cleanup Test Resources

#!/bin/bash
# Clean up test namespace

export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml
kubectl delete namespace k3s-test
echo "Test resources cleaned up"

Quick Reference Commands

# Set kubeconfig
export KUBECONFIG=~/k3s-cluster/kubeconfig.yaml

# View cluster
kubectl get nodes -o wide
kubectl get pods -A

# SSH to a VM
source ~/k3s-cluster/inventory.env
ssh root@$MASTER1

# View k3s logs on a VM
ssh root@$MASTER1 "journalctl -u k3s -f"

# Check token (if lost)
ssh root@$MASTER1 "cat /var/lib/rancher/k3s/server/token"

# Uninstall k3s from a master
ssh root@$MASTER1 "/usr/local/bin/k3s-uninstall.sh"

# Uninstall k3s from a worker
ssh root@$WORKER1 "/usr/local/bin/k3s-agent-uninstall.sh"

Troubleshooting

| Issue | Debug Command | Solution |
|-------|---------------|----------|
| Node not joining | `ssh VM "journalctl -u k3s -n 50"` | Check token, verify IPv6 connectivity |
| Pods can't reach internet | `ssh VM "nft list ruleset"` | Verify masquerade rules, check public interface |
| Flannel errors | `ssh VM "cat /var/lib/rancher/k3s/agent/etc/flannel/net-conf.json"` | Verify `flannel-iface mycelium` |
| DNS not working | `kubectl logs -n kube-system -l k8s-app=kube-dns` | Check coredns pods |
| etcd issues | `ssh MASTER "journalctl -u k3s \| grep etcd"` | Ensure time is synced across masters |
Depends on: #1 VM installer using cloudhypervisor (geomind_research/herolib_rust)
Reference: geomind_research/herolib_rust#2