Deploying and Dockerizing Game Microservices in Practice
This article walks through how to Dockerize game microservices and manage their deployment with orchestration tooling.
I. Dockerization Basics
1. Why Dockerize?
For a game microservice architecture, Docker provides:
- Environment consistency: identical development, test, and production environments
- Fast deployment: containers start in seconds; a full cluster rolls out in minutes
- Resource isolation: CPU, memory, network, and disk isolation
- Elastic scaling: scale out and in quickly as load changes
- Version management: image version control and easy rollback
2. Challenges of Dockerizing Game Services
- Latency sensitivity (especially battle services)
- Long-lived connection management (gateway services)
- Stateful services (data that must be persisted)
- High performance requirements (near bare-metal performance)
II. Dockerfile Best Practices
1. Multi-Stage Build Example
```dockerfile
# Build stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags="-w -s" \
    -o activity-service \
    ./cmd/server

# Runtime stage
FROM alpine:3.18
RUN apk --no-cache add \
    ca-certificates \
    tzdata \
    curl \
    && update-ca-certificates

# Run as a dedicated non-root user
RUN addgroup -g 1001 -S gamesvr && \
    adduser -u 1001 -S gamesvr -G gamesvr

WORKDIR /app
COPY --from=builder --chown=gamesvr:gamesvr /app/activity-service .
COPY --from=builder --chown=gamesvr:gamesvr /app/configs ./configs
COPY --from=builder --chown=gamesvr:gamesvr /app/scripts/entrypoint.sh .
RUN chmod +x activity-service entrypoint.sh

USER gamesvr

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

EXPOSE 8080 50051
ENTRYPOINT ["./entrypoint.sh"]
```
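The Dockerfile hands control to `entrypoint.sh`, which the original does not show. A minimal sketch of what such a script typically does (the `--config` flag and the templating step are assumptions, not part of the original):

```sh
#!/bin/sh
set -e

# Optional: render config from environment variables (placeholder step)
# envsubst < configs/config.tmpl.yaml > configs/config.yaml

# exec replaces the shell, so the service receives SIGTERM directly;
# this matters for graceful shutdown under Kubernetes.
exec ./activity-service --config ./configs/config.yaml "$@"
```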
2. Dockerfile Differences by Service Type
Services that link native code need CGO enabled and a C toolchain in the builder stage:

```dockerfile
FROM golang:1.21-alpine AS builder
RUN apk add --no-cache gcc musl-dev linux-headers
ENV CGO_ENABLED=1
...
```
3. Base Image Optimization

```dockerfile
# Option 1: distroless (no shell or package manager, small attack surface)
FROM gcr.io/distroless/static-debian11

# Option 2: scratch (smallest possible image; copy in CA certificates yourself)
FROM scratch
COPY --from=builder /app/activity-service /
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
```
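Whichever base image you pick, a small build context speeds up every `docker build`. A typical `.dockerignore` for a Go service (the contents here are a suggestion, not from the original):

```
# .dockerignore: keep the build context lean
.git
logs/
data/
*.md
Dockerfile*
docker-compose*.yml
k8s/
monitoring/
```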
III. Local Development with Docker Compose
1. Full Game Development Environment Composition
```yaml
version: '3.8'

services:
  zookeeper:
    image: zookeeper:3.8
    container_name: game-zookeeper
    restart: unless-stopped
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
    volumes:
      - zookeeper_data:/data
      - zookeeper_datalog:/datalog
    networks:
      - game-network

  kafka:
    image: confluentinc/cp-kafka:7.4.0
    container_name: game-kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Internal and host listeners must use distinct ports;
      # in-network clients connect to kafka:29092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - game-network

  redis-cluster:
    image: bitnami/redis-cluster:7.0
    container_name: game-redis
    restart: unless-stopped
    ports:
      - "6379:6379"
      - "16379:16379"
    environment:
      - REDIS_PASSWORD=game_redis_pass
      - REDIS_NODES=redis-cluster
      - REDIS_CLUSTER_REPLICAS=1
    volumes:
      - redis_data:/bitnami/redis/data
    networks:
      - game-network
    command: >
      /opt/bitnami/scripts/redis-cluster/entrypoint.sh
      /opt/bitnami/scripts/redis-cluster/run.sh

  mysql:
    image: mysql:8.0
    container_name: game-mysql
    restart: unless-stopped
    command: >
      --default-authentication-plugin=mysql_native_password
      --character-set-server=utf8mb4
      --collation-server=utf8mb4_unicode_ci
    environment:
      MYSQL_ROOT_PASSWORD: game_root_pass
      MYSQL_DATABASE: game_platform
      MYSQL_USER: game_user
      MYSQL_PASSWORD: game_user_pass
    ports:
      - "3306:3306"
    volumes:
      - mysql_data:/var/lib/mysql
      - ./init-scripts:/docker-entrypoint-initdb.d
    networks:
      - game-network
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5

  nacos:
    image: nacos/nacos-server:v2.2.3
    container_name: game-nacos
    restart: unless-stopped
    environment:
      - MODE=standalone
      - SPRING_DATASOURCE_PLATFORM=mysql
      - MYSQL_SERVICE_HOST=mysql
      - MYSQL_SERVICE_DB_NAME=nacos_config
      - MYSQL_SERVICE_PORT=3306
      - MYSQL_SERVICE_USER=game_user
      - MYSQL_SERVICE_PASSWORD=game_user_pass
      - NACOS_AUTH_ENABLE=true
      - NACOS_AUTH_USERNAME=nacos
      - NACOS_AUTH_PASSWORD=nacos
    ports:
      - "8848:8848"
      - "9848:9848"
    depends_on:
      mysql:
        condition: service_healthy
    volumes:
      - nacos_data:/home/nacos/data
    networks:
      - game-network

  gateway:
    build:
      context: ./gateway
      dockerfile: Dockerfile.dev
    container_name: game-gateway
    restart: unless-stopped
    ports:
      - "8000:8000"
      - "8001:8001"
      - "9000:9000"
    environment:
      - APP_PROFILE=dev
      - NACOS_HOST=nacos
      - NACOS_PORT=8848
      - REDIS_HOST=redis-cluster
      - REDIS_PORT=6379
    volumes:
      - ./gateway:/app
      - /app/node_modules
    depends_on:
      - nacos
      - redis-cluster
    networks:
      - game-network
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M

  activity-service:
    build:
      context: ./activity-service
      dockerfile: Dockerfile.dev
    container_name: game-activity
    restart: unless-stopped
    ports:
      - "50051:50051"
      - "8081:8080"
    environment:
      - APP_PROFILE=dev
      - NACOS_HOST=nacos
      - NACOS_PORT=8848
      - DB_HOST=mysql
      - DB_PORT=3306
      - REDIS_HOST=redis-cluster
      - KAFKA_HOST=kafka
    volumes:
      - ./activity-service:/app
    depends_on:
      - nacos
      - mysql
      - redis-cluster
      - kafka
    networks:
      - game-network

  matching-service:
    build:
      context: ./matching-service
      dockerfile: Dockerfile.dev
    container_name: game-matching
    restart: unless-stopped
    environment:
      - APP_PROFILE=dev
      - NACOS_HOST=nacos
    depends_on:
      - nacos
    networks:
      - game-network

  prometheus:
    image: prom/prometheus:latest
    container_name: game-prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=200h'
      - '--web.enable-lifecycle'
    networks:
      - game-network

  grafana:
    image: grafana/grafana:latest
    container_name: game-grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_INSTALL_PLUGINS=grafana-piechart-panel
    volumes:
      - grafana_data:/var/lib/grafana
      - ./monitoring/dashboards:/etc/grafana/provisioning/dashboards
      - ./monitoring/datasources:/etc/grafana/provisioning/datasources
    depends_on:
      - prometheus
    networks:
      - game-network

  loki:
    image: grafana/loki:2.8.0
    container_name: game-loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - loki_data:/loki
      - ./monitoring/loki-config.yaml:/etc/loki/local-config.yaml
    networks:
      - game-network

  promtail:
    image: grafana/promtail:2.8.0
    container_name: game-promtail
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./monitoring/promtail-config.yaml:/etc/promtail/config.yaml
      - ./activity-service/logs:/var/log/activity-service
      - ./gateway/logs:/var/log/gateway
    command:
      - -config.file=/etc/promtail/config.yaml
    depends_on:
      - loki
    networks:
      - game-network

networks:
  game-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
          gateway: 172.20.0.1

volumes:
  zookeeper_data:
  zookeeper_datalog:
  kafka_data:
  redis_data:
  mysql_data:
  nacos_data:
  prometheus_data:
  grafana_data:
  loki_data:
```
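The compose file builds services from a `Dockerfile.dev` and bind-mounts the source tree, which implies a hot-reload image. One way to build it for the Go services is with the `air` live-reload tool; this is a sketch, and the tool choice is an assumption rather than something stated in the original:

```dockerfile
# Dockerfile.dev: development image with live reload (a sketch)
FROM golang:1.21-alpine
WORKDIR /app

# Install the air live-reload tool
RUN go install github.com/cosmtrek/air@latest

# Dependencies are cached in a layer; source code is bind-mounted by compose
COPY go.mod go.sum ./
RUN go mod download

EXPOSE 8080 50051
CMD ["air", "-c", ".air.toml"]
```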
2. Development Environment Startup Script
```bash
#!/bin/bash
set -e

echo "🚀 Starting the game development environment..."

# Check prerequisites
if ! command -v docker &> /dev/null; then
    echo "❌ Please install Docker first"
    exit 1
fi

if ! command -v docker-compose &> /dev/null; then
    echo "❌ Please install Docker Compose first"
    exit 1
fi

# Create local directories for logs and data
mkdir -p logs/{gateway,activity,matching}
mkdir -p data/{mysql,redis,kafka}

export COMPOSE_PROJECT_NAME=game_dev
export COMPOSE_DOCKER_CLI_BUILD=1
export DOCKER_BUILDKIT=1

echo "📦 Building Docker images..."
docker-compose build --parallel --compress

echo "🚢 Starting services..."
docker-compose up -d

echo "⏳ Waiting for services to come up..."
sleep 10

echo "🔍 Checking service status..."
docker-compose ps

echo ""
echo "✅ Development environment is up!"
echo "========================"
echo "🌐 Endpoints:"
echo "Gateway:             ws://localhost:8000"
echo "Nacos config center: http://localhost:8848/nacos (user: nacos/nacos)"
echo "Grafana:             http://localhost:3000 (user: admin/admin)"
echo "Prometheus:          http://localhost:9090"
echo ""
echo "📝 Useful commands:"
echo "Tail logs:       docker-compose logs -f [service]"
echo "Restart service: docker-compose restart [service]"
echo "Stop everything: docker-compose down"
echo "Rebuild service: docker-compose up -d --build [service]"
```
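The fixed `sleep 10` is a rough heuristic. Containers that define a healthcheck (MySQL above does) can be polled instead; a drop-in replacement for that step, assuming the container name from the compose file:

```bash
echo "⏳ Waiting for MySQL to report healthy..."
until [ "$(docker inspect --format '{{.State.Health.Status}}' game-mysql)" = "healthy" ]; do
    sleep 2
done
```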
IV. Production Deployment on Kubernetes
1. Kubernetes Manifest Layout
```
k8s/
├── base/                 # shared base configuration
│   ├── namespace.yaml
│   ├── configmap-base.yaml
│   └── secrets-base.yaml
├── overlays/             # per-environment overlays
│   ├── dev/
│   │   ├── kustomization.yaml
│   │   ├── configmap-env.yaml
│   │   └── deployment-patch.yaml
│   ├── staging/
│   └── prod/
├── services/             # service manifests
│   ├── activity/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   ├── hpa.yaml
│   │   └── pdb.yaml
│   ├── gateway/
│   └── matching/
├── infrastructure/       # infrastructure
│   ├── redis-cluster/
│   ├── mysql/
│   └── kafka/
└── monitoring/           # monitoring
    ├── prometheus/
    └── grafana/
```
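The CI pipeline later runs `kustomize edit set image` inside `k8s/overlays/prod`, which implies an overlay along these lines (a sketch; the exact resources and patch file names are assumptions):

```yaml
# k8s/overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: game-prod
resources:
  - ../../base
  - ../../services/activity
patches:
  - path: deployment-patch.yaml
images:
  - name: activity-service
    newName: registry.game.com/activity-service
    newTag: v1.2.0
```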
2. Production Deployment Manifests
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: activity-service
  namespace: game-prod
  labels:
    app: activity-service
    component: game-backend
    tier: business
spec:
  replicas: 3
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: activity-service
  template:
    metadata:
      labels:
        app: activity-service
        version: v1.2.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - activity-service
                topologyKey: kubernetes.io/hostname
      nodeSelector:
        node-type: game-server
      priorityClassName: high-priority
      initContainers:
        - name: init-config
          image: busybox:1.28
          command: ['sh', '-c', 'cp /app/config-templates/* /app/configs/']
          volumeMounts:
            - name: config-template-volume
              mountPath: /app/config-templates
            - name: config-volume
              mountPath: /app/configs
      containers:
        - name: activity-service
          image: registry.game.com/activity-service:v1.2.0
          imagePullPolicy: Always
          args:
            - "--log.format=json"
            - "--log.level=info"
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: grpc
              containerPort: 50051
              protocol: TCP
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: APP_PROFILE
              value: "prod"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          envFrom:
            - configMapRef:
                name: game-activity-config
            - secretRef:
                name: game-db-secret
          volumeMounts:
            - name: config-volume
              mountPath: /app/configs
              readOnly: true
            - name: logs-volume
              mountPath: /app/logs
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 30"]
          securityContext:
            runAsUser: 1001
            runAsGroup: 1001
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
      volumes:
        - name: config-template-volume
          configMap:
            name: activity-config-template
        - name: config-volume
          emptyDir: {}
        - name: logs-volume
          emptyDir: {}
      serviceAccountName: game-service-account
      dnsConfig:
        options:
          - name: ndots
            value: "2"
          - name: single-request-reopen
```
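The Deployment pulls its environment from `game-activity-config` and `game-db-secret` via `envFrom`. Those objects are not shown in the original; a plausible shape would be the following, with all values as placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: game-activity-config
  namespace: game-prod
data:
  NACOS_HOST: "nacos.game-prod.svc.cluster.local"   # placeholder
  REDIS_HOST: "redis-cluster.game-prod.svc.cluster.local"
  KAFKA_HOST: "kafka.game-prod.svc.cluster.local"
---
apiVersion: v1
kind: Secret
metadata:
  name: game-db-secret
  namespace: game-prod
type: Opaque
stringData:
  DB_USER: "game_user"        # placeholder
  DB_PASSWORD: "change-me"    # placeholder; source from a secret manager in practice
```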
```yaml
apiVersion: v1
kind: Service
metadata:
  name: activity-service
  namespace: game-prod
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    prometheus.io/scrape: "true"
spec:
  selector:
    app: activity-service
  ports:
    - name: http
      port: 8080
      targetPort: 8080
      protocol: TCP
    - name: grpc
      port: 50051
      targetPort: 50051
      protocol: TCP
    - name: metrics
      port: 9100
      targetPort: 9100
      protocol: TCP
  type: ClusterIP
  sessionAffinity: None
```
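The manifest layout above lists a `pdb.yaml` that the original never shows. A PodDisruptionBudget keeps voluntary disruptions (node drains, cluster upgrades) from taking down too many replicas at once; a sketch matching the 3-replica Deployment:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: activity-service-pdb
  namespace: game-prod
spec:
  minAvailable: 2          # always keep at least 2 of 3 replicas serving
  selector:
    matchLabels:
      app: activity-service
```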
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: activity-service-hpa
  namespace: game-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: activity-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: active_connections
        target:
          type: AverageValue
          averageValue: "5000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
```
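The `active_connections` Pods metric is not built into Kubernetes; it has to be served through the custom metrics API, typically by prometheus-adapter. A sketch of the adapter rule, assuming the service already exports an `active_connections` gauge to Prometheus:

```yaml
# prometheus-adapter values excerpt (metric name and labels are assumptions)
rules:
  custom:
    - seriesQuery: 'active_connections{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: { resource: "namespace" }
          pod: { resource: "pod" }
      name:
        matches: "active_connections"
        as: "active_connections"
      metricsQuery: 'avg(active_connections{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```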
3. Game-Specific Tuning
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: game-kernel-params
  namespace: kube-system
data:
  99-game-optimizations.conf: |
    # Network tuning
    net.core.rmem_max = 134217728
    net.core.wmem_max = 134217728
    net.ipv4.tcp_rmem = 4096 87380 134217728
    net.ipv4.tcp_wmem = 4096 65536 134217728
    net.ipv4.tcp_congestion_control = bbr
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_fin_timeout = 30
    # File descriptor limits
    fs.file-max = 1000000
    fs.nr_open = 1000000
    # Connection tracking
    net.netfilter.nf_conntrack_max = 1048576
    net.nf_conntrack_max = 1048576
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: game-optimizer
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: game-optimizer
  template:
    metadata:
      labels:
        name: game-optimizer
    spec:
      hostNetwork: true
      containers:
        - name: optimizer
          image: busybox:1.28
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Apply kernel parameters, then keep the pod alive
              sysctl -p /etc/sysctl.d/99-game-optimizations.conf
              tail -f /dev/null
          volumeMounts:
            - name: sysctl-conf
              mountPath: /etc/sysctl.d
          securityContext:
            privileged: true
      volumes:
        - name: sysctl-conf
          configMap:
            name: game-kernel-params
      tolerations:
        - operator: Exists
```
V. CI/CD Pipeline
1. GitLab CI/CD Example
```yaml
stages:
  - test
  - build
  - scan
  - deploy-dev
  - deploy-staging
  - deploy-prod

variables:
  DOCKER_HOST: tcp://docker:2375
  DOCKER_TLS_CERTDIR: ""
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA

.activity-service: &activity-service
  variables:
    SERVICE_NAME: activity-service
    DOCKERFILE_PATH: Dockerfile
  before_script:
    - echo "Building $SERVICE_NAME"
  after_script:
    - echo "Cleanup..."

unit-test:
  stage: test
  script:
    - go test ./... -v -coverprofile=coverage.out
    - go tool cover -func=coverage.out
    # Convert the Go profile to Cobertura XML so GitLab can render it
    # (assumes network access to fetch the converter tool)
    - go run github.com/boumenot/gocover-cobertura@latest < coverage.out > coverage.xml
  artifacts:
    paths:
      - coverage.out
    reports:
      cobertura: coverage.xml
  only:
    - merge_requests
    - main
    - develop

code-quality:
  stage: scan
  image: sonarsource/sonar-scanner-cli:latest
  variables:
    SONAR_USER_HOME: "${CI_PROJECT_DIR}/.sonar"
    GIT_DEPTH: "0"
  cache:
    key: "${CI_JOB_NAME}"
    paths:
      - .sonar/cache
  script:
    - sonar-scanner
  allow_failure: true

build-activity-service:
  <<: *activity-service
  stage: build
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login $CI_REGISTRY -u $CI_REGISTRY_USER --password-stdin
    - docker build
        --build-arg VERSION=$CI_COMMIT_SHORT_SHA
        --build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
        -t $IMAGE_TAG
        -f $DOCKERFILE_PATH .
    - docker scan --accept-license --exclude-base $IMAGE_TAG || true
    - docker push $IMAGE_TAG
    - |
      if [ "$CI_COMMIT_BRANCH" == "main" ]; then
        docker tag $IMAGE_TAG $CI_REGISTRY_IMAGE:latest
        docker push $CI_REGISTRY_IMAGE:latest
      fi
  artifacts:
    reports:
      container_scanning: gl-container-scanning-report.json

deploy-dev:
  stage: deploy-dev
  image: bitnami/kubectl:latest
  environment:
    name: development
    url: https://dev.game.com
  script:
    - |
      kubectl config use-context dev-cluster
      kubectl set image deployment/activity-service \
        activity-service=$IMAGE_TAG \
        -n game-dev
      kubectl rollout status deployment/activity-service -n game-dev
  only:
    - develop

deploy-prod:
  stage: deploy-prod
  image: bitnami/kubectl:latest
  environment:
    name: production
    url: https://game.com
  script:
    - |
      # Deploy with Kustomize
      kubectl config use-context prod-cluster
      cd k8s/overlays/prod
      kustomize edit set image activity-service=$IMAGE_TAG
      kubectl apply -k .
      kubectl rollout status deployment/activity-service -n game-prod --timeout=300s
      ./scripts/health-check.sh
  only:
    - main
  when: manual
  dependencies:
    - build-activity-service
```
2. Deployment Verification Script
```bash
#!/bin/bash
set -e

SERVICE_NAME="activity-service"
NAMESPACE="game-prod"
TIMEOUT=300
INTERVAL=10

echo "🔍 Starting health check for $SERVICE_NAME..."

echo "Checking pod status..."
end=$((SECONDS+TIMEOUT))
while [ $SECONDS -lt $end ]; do
    READY=$(kubectl get deployment $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.status.readyReplicas}')
    DESIRED=$(kubectl get deployment $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.status.replicas}')
    READY=${READY:-0}   # readyReplicas is unset until at least one pod is ready
    if [ "$READY" == "$DESIRED" ] && [ "$READY" != "0" ]; then
        echo "✅ All pods ready ($READY/$DESIRED)"
        break
    fi
    echo "Waiting for pods to become ready ($READY/$DESIRED)..."
    sleep $INTERVAL
done

if [ "$READY" != "$DESIRED" ]; then
    echo "❌ Pods did not become ready within the timeout"
    exit 1
fi

ENDPOINT=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.clusterIP}')
HTTP_PORT=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.ports[?(@.name=="http")].port}')

echo "Checking the HTTP health endpoint..."
if curl -f http://$ENDPOINT:$HTTP_PORT/health; then
    echo "✅ HTTP health check passed"
else
    echo "❌ HTTP health check failed"
    exit 1
fi

echo "Checking gRPC health..."
GRPC_PORT=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')

if ./bin/grpc_health_probe -addr=$ENDPOINT:$GRPC_PORT; then
    echo "✅ gRPC health check passed"
else
    echo "❌ gRPC health check failed"
    exit 1
fi

echo "Running a quick load sanity check..."
# Guard the ab call directly so 'set -e' does not abort the script on failure
if ab -n 1000 -c 100 http://$ENDPOINT:$HTTP_PORT/metrics > /dev/null 2>&1; then
    echo "✅ Load sanity check passed"
else
    echo "⚠️ Load sanity check reported errors"
fi

echo "🎉 $SERVICE_NAME deployment verification complete"
```
VI. Monitoring and Log Collection
1. Docker Logging Configuration
Log rotation and labels are configured on the Docker daemon, typically in `/etc/docker/daemon.json`:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3",
    "labels": "service_name,environment",
    "tag": "{{.ImageName}}|{{.Name}}|{{.ID}}"
  }
}
```
2. Structured Log Output
```go
import "github.com/sirupsen/logrus"

func initLogger() *logrus.Logger {
	logger := logrus.New()
	if env.IsProduction() { // env.IsProduction: app-specific environment helper
		logger.SetFormatter(&logrus.JSONFormatter{
			TimestampFormat: "2006-01-02T15:04:05.999Z07:00",
			FieldMap: logrus.FieldMap{
				logrus.FieldKeyTime:  "@timestamp",
				logrus.FieldKeyLevel: "level",
				logrus.FieldKeyMsg:   "message",
				logrus.FieldKeyFunc:  "function",
			},
		})
	}
	return logger
}

// Example: emit one structured business event
logger.WithFields(logrus.Fields{
	"service":     "activity-service",
	"user_id":     userID,
	"activity_id": activityID,
	"duration_ms": duration.Milliseconds(),
	"trace_id":    traceID,
}).Info("User joined activity")
```
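To make the `trace_id` field shown above available everywhere, a common approach is to attach a request-scoped log entry in middleware. A minimal sketch; the `X-Trace-Id` header name and the context key are assumptions, not part of the original:

```go
package middleware

import (
	"context"
	"net/http"

	"github.com/google/uuid"
	"github.com/sirupsen/logrus"
)

type ctxKey struct{}

// WithLogging attaches a request-scoped *logrus.Entry carrying trace_id.
func WithLogging(logger *logrus.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		traceID := r.Header.Get("X-Trace-Id") // assumed header name
		if traceID == "" {
			traceID = uuid.NewString()
		}
		entry := logger.WithField("trace_id", traceID)
		ctx := context.WithValue(r.Context(), ctxKey{}, entry)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// FromContext retrieves the request-scoped entry, falling back to a default.
func FromContext(ctx context.Context) *logrus.Entry {
	if e, ok := ctx.Value(ctxKey{}).(*logrus.Entry); ok {
		return e
	}
	return logrus.NewEntry(logrus.StandardLogger())
}
```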
VII. Best Practices Summary
1. Image Optimization
- Use multi-stage builds to shrink images
- Use Alpine or distroless base images
- Strip debugging tools and package managers
2. Security Hardening
- Run containers as a non-root user
- Use a read-only root filesystem
- Drop unneeded Linux capabilities
- Scan images for security vulnerabilities
3. Performance Tuning
- Set sensible CPU and memory limits
- Keep hot data on local SSDs
- Tune the network stack (TCP parameters)
- Choose health-check intervals carefully
4. Observability
- Expose a /metrics endpoint on every service
- Emit structured logs in JSON format
- Integrate distributed tracing
- Instrument business-level metrics
5. Game-Specific Considerations
- Graceful termination for long-lived connection services (see the sketch after this list)
- Persistent storage for stateful services
- Low-latency requirements for battle services
- Connection management in the gateway
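The first point deserves a sketch. The Deployment above gives pods a 30-second preStop sleep, so the process itself must stop accepting new work on SIGTERM and drain existing connections before the grace period ends. A minimal outline for an HTTP/WebSocket-style service; details such as the 25-second drain budget are assumptions:

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %v", err)
		}
	}()

	// Wait for SIGTERM from the kubelet (sent after the preStop hook runs).
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Drain: stop accepting new connections and let in-flight requests
	// and long-lived connections finish within the remaining grace period.
	ctx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("forced shutdown: %v", err)
	}
}
```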
With the Dockerization and deployment practices above, you can build a stable, scalable, and maintainable game microservice architecture.