
n8n Deployment and Operations - Production Environment Management

When n8n workflows move from development into production, more factors come into play: stability, scalability, security, and maintainability. This article looks at how to deploy and operate n8n in a production environment.

Production Deployment Strategy

Choosing a Deployment Architecture

In production there are several deployment options to choose from:

Single-node deployment: suitable for small teams or early-stage use; low cost, but limited scalability.

Cluster deployment: suitable for large teams or high-concurrency scenarios; scales horizontally, but configuration is more involved.

Cloud-managed deployment: runs on a cloud provider's container service; low operational overhead, but carries vendor lock-in risk.

Environment Planning

text
# A typical three-environment setup
Development  →  Staging  →  Production

  rapid iteration    feature validation    stable operation
  isolated data      performance testing   high availability

Docker Containerized Deployment

Docker is currently the most popular containerization option and keeps the n8n application consistent across environments.

Basic Dockerfile

dockerfile
FROM n8nio/n8n:latest

# Set the working directory
WORKDIR /home/node

# Copy custom configuration and custom nodes
COPY config/ ./config/
COPY custom-nodes/ ./custom-nodes/

# Set environment variables
# NOTE: placeholder values - inject real secrets at runtime rather than baking them into the image
ENV N8N_BASIC_AUTH_ACTIVE=true
ENV N8N_BASIC_AUTH_USER=admin
ENV N8N_ENCRYPTION_KEY=your-encryption-key

# Expose the n8n port
EXPOSE 5678

# Start command
CMD ["n8n", "start"]

Docker Compose Configuration

yaml
version: '3.8'
services:
  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
      - N8N_HOST=${N8N_HOST}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${N8N_HOST}/
      - GENERIC_TIMEZONE=${TIMEZONE}
      # Point n8n at the postgres service defined below; without these
      # settings n8n silently falls back to its bundled SQLite database.
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - n8n_data:/home/node/.n8n
      - ./custom-nodes:/home/node/.n8n/custom
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:13
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data

volumes:
  n8n_data:
  postgres_data:
  redis_data:
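
The compose file reads its credentials from environment variables; a .env file next to docker-compose.yml is one way to provide them (all values below are placeholders):

bash
# .env - placeholder values, do not commit real secrets
N8N_USER=admin
N8N_PASSWORD=change-me
N8N_HOST=n8n.example.com
TIMEZONE=Asia/Shanghai
POSTGRES_USER=n8n_user
POSTGRES_PASSWORD=change-me
POSTGRES_DB=n8n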

Kubernetes Cluster Deployment

For scenarios that require high availability and automatic scaling, Kubernetes is the better choice.

Deployment Configuration

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-deployment
  labels:
    app: n8n
spec:
  replicas: 3
  selector:
    matchLabels:
      app: n8n
  template:
    metadata:
      labels:
        app: n8n
    spec:
      containers:
      - name: n8n
        image: n8nio/n8n:latest
        ports:
        - containerPort: 5678
        env:
        - name: N8N_BASIC_AUTH_ACTIVE
          value: "true"
        - name: N8N_BASIC_AUTH_USER
          valueFrom:
            secretKeyRef:
              name: n8n-secret
              key: username
        - name: N8N_BASIC_AUTH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: n8n-secret
              key: password
        - name: DB_TYPE
          value: "postgresdb"
        - name: DB_POSTGRESDB_HOST
          value: "postgres-service"
        - name: DB_POSTGRESDB_DATABASE
          value: "n8n"
        - name: DB_POSTGRESDB_USER
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: username
        - name: DB_POSTGRESDB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        volumeMounts:
        - name: n8n-data
          mountPath: /home/node/.n8n
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
      volumes:
      - name: n8n-data
        persistentVolumeClaim:
          claimName: n8n-pvc
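
PersistentVolumeClaim Configuration

The Deployment references a PersistentVolumeClaim named n8n-pvc, which has to exist beforehand. A minimal sketch; the access mode, storage class behaviour, and size are assumptions to adapt to your cluster:

yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: n8n-pvc
spec:
  # With replicas: 3 the volume is mounted by several pods at once, which
  # requires a storage class that supports ReadWriteMany (e.g. NFS or CephFS).
  # With only ReadWriteOnce storage, run a single replica instead.
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi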

Service Configuration

yaml
apiVersion: v1
kind: Service
metadata:
  name: n8n-service
spec:
  selector:
    app: n8n
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5678
  type: LoadBalancer
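
HorizontalPodAutoscaler Configuration

Since automatic scaling is one of the reasons to run on Kubernetes, a HorizontalPodAutoscaler can adjust the number of n8n replicas based on load. A sketch assuming metrics-server is installed in the cluster; the replica bounds and CPU target below are placeholder values:

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n-deployment
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70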

Load Balancing Configuration

With multiple n8n instances running, a load balancer is needed to distribute requests across them.

Nginx Configuration

nginx
upstream n8n_backend {
    least_conn;
    server n8n-1:5678 max_fails=3 fail_timeout=30s;
    server n8n-2:5678 max_fails=3 fail_timeout=30s;
    server n8n-3:5678 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name your-n8n-domain.com;

    # Redirect to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name your-n8n-domain.com;

    # SSL certificate configuration
    ssl_certificate /etc/ssl/certs/your-cert.pem;
    ssl_certificate_key /etc/ssl/private/your-key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;

    # Security headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";

    location / {
        proxy_pass http://n8n_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    # Health check
    location /health {
        access_log off;
        proxy_pass http://n8n_backend/healthz;
    }
}


Database Configuration

In production, the stability and performance of the database directly affect the availability of the whole system, so it needs careful configuration.

PostgreSQL Configuration

PostgreSQL is the recommended production database for n8n; compared with SQLite it offers better concurrency and stronger data integrity.

Basic Configuration

bash
# Environment variable configuration
export DB_TYPE=postgresdb
export DB_POSTGRESDB_HOST=localhost
export DB_POSTGRESDB_PORT=5432
export DB_POSTGRESDB_DATABASE=n8n
export DB_POSTGRESDB_USER=n8n_user
export DB_POSTGRESDB_PASSWORD=secure_password
export DB_POSTGRESDB_SCHEMA=public

PostgreSQL Tuning

conf
# postgresql.conf tuning
shared_buffers = 256MB                   # shared buffer pool
effective_cache_size = 1GB               # planner estimate of available cache
work_mem = 4MB                           # per-operation working memory
maintenance_work_mem = 64MB              # memory for maintenance operations
checkpoint_completion_target = 0.9       # spread checkpoints over this fraction
wal_buffers = 16MB                       # WAL buffer size
default_statistics_target = 100          # statistics sampling target

# Connections
max_connections = 200                    # maximum concurrent connections
shared_preload_libraries = 'pg_stat_statements'  # preloaded extensions

# Logging
log_statement = 'mod'                    # log data-modifying statements
log_min_duration_statement = 1000        # log queries slower than 1 second
log_checkpoints = on                     # log checkpoints
log_connections = on                     # log connections
log_disconnections = on                  # log disconnections

Database Initialization Script

sql
-- Create the database and user
CREATE DATABASE n8n;
CREATE USER n8n_user WITH ENCRYPTED PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE n8n TO n8n_user;

-- Create the required extensions
\c n8n;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";

-- Grant schema privileges to the n8n user
GRANT ALL ON SCHEMA public TO n8n_user;
GRANT ALL ON ALL TABLES IN SCHEMA public TO n8n_user;
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO n8n_user;

Redis Cache Configuration

Redis is used mainly for queue management and caching, and can noticeably improve system performance.

Redis Configuration File

conf
# redis.conf
bind 127.0.0.1
port 6379
timeout 300
tcp-keepalive 60

# Memory
maxmemory 512mb
# noeviction is the safer policy when Redis backs the Bull execution queue;
# an LRU policy could evict queue keys under memory pressure
maxmemory-policy noeviction

# Persistence
save 900 1
save 300 10
save 60 10000

# Logging
loglevel notice
logfile /var/log/redis/redis-server.log

# Security
requirepass your_redis_password

n8n Redis Integration

bash
# Environment variable configuration
export QUEUE_BULL_REDIS_HOST=localhost
export QUEUE_BULL_REDIS_PORT=6379
export QUEUE_BULL_REDIS_PASSWORD=your_redis_password
export QUEUE_BULL_REDIS_DB=0

# Queue mode
export EXECUTIONS_MODE=queue
export QUEUE_BULL_REDIS_TIMEOUT_THRESHOLD=30000
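
Queue Mode Workers

With EXECUTIONS_MODE=queue the main n8n process only enqueues executions; separate worker processes pull them from Redis and run them. Below is a sketch of an extra worker service for the Docker Compose setup shown earlier, added under services:. The service name, the REDIS_PASSWORD and N8N_ENCRYPTION_KEY variables are assumptions; workers must share the same database and encryption key as the main instance.

yaml
  n8n-worker:
    image: n8nio/n8n:latest
    restart: unless-stopped
    # The image's entrypoint runs n8n, so "worker" starts "n8n worker"
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PASSWORD=${REDIS_PASSWORD}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
    depends_on:
      - postgres
      - redis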

Database Backup Strategy

Automated Backup Script

bash
#!/bin/bash
# backup_n8n.sh

# Configuration
DB_NAME="n8n"
DB_USER="n8n_user"
BACKUP_DIR="/backup/n8n"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="n8n_backup_${DATE}.sql"

# Create the backup directory
mkdir -p $BACKUP_DIR

# Dump the database (configure ~/.pgpass or PGPASSWORD so this can run unattended)
pg_dump -h localhost -U $DB_USER -d $DB_NAME > $BACKUP_DIR/$BACKUP_FILE

# Compress the dump
gzip $BACKUP_DIR/$BACKUP_FILE

# Delete backups older than 7 days
find $BACKUP_DIR -name "*.gz" -mtime +7 -delete

# Log the run
echo "$(date): Backup completed - $BACKUP_FILE.gz" >> /var/log/n8n_backup.log

Scheduled Backup Jobs

bash
# Add to crontab
# Daily backup at 02:00
0 2 * * * /path/to/backup_n8n.sh

# Full backup every Sunday at 01:00
0 1 * * 0 /path/to/full_backup_n8n.sh

Monitoring and Logging

Monitoring is key to keeping a production system running reliably. We need to track application performance, resource usage, and business metrics.

Application Monitoring

Prometheus + Grafana Monitoring Setup

First, configure Prometheus to scrape metrics from n8n:

yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['localhost:5678']
    metrics_path: '/metrics'
    scrape_interval: 30s

  - job_name: 'postgres'
    static_configs:
      - targets: ['localhost:9187']

  - job_name: 'redis'
    static_configs:
      - targets: ['localhost:9121']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
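
Note that n8n does not serve /metrics by default; as I understand the n8n settings it has to be switched on via an environment variable (verify the variable name against your version's documentation):

bash
# Enable the built-in Prometheus endpoint (served at /metrics on the main port)
export N8N_METRICS=true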

Custom Metrics

javascript
// Custom n8n metrics with prom-client
const prometheus = require('prom-client');

// Define the metrics
const workflowExecutions = new prometheus.Counter({
  name: 'n8n_workflow_executions_total',
  help: 'Total number of workflow executions',
  labelNames: ['workflow_name', 'status']
});

const executionDuration = new prometheus.Histogram({
  name: 'n8n_workflow_execution_duration_seconds',
  help: 'Duration of workflow executions',
  labelNames: ['workflow_name'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 300]
});

// Record metrics when a workflow execution finishes
function recordWorkflowExecution(workflowName, status, duration) {
  workflowExecutions.inc({ workflow_name: workflowName, status: status });
  executionDuration.observe({ workflow_name: workflowName }, duration);
}

Grafana Dashboard Configuration

json
{
  "dashboard": {
    "title": "n8n 监控仪表板",
    "panels": [
      {
        "title": "工作流执行次数",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(n8n_workflow_executions_total[5m]))",
            "legendFormat": "执行/秒"
          }
        ]
      },
      {
        "title": "工作流成功率",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(n8n_workflow_executions_total{status=\"success\"}[5m])) / sum(rate(n8n_workflow_executions_total[5m])) * 100",
            "legendFormat": "成功率 %"
          }
        ]
      },
      {
        "title": "平均执行时间",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(n8n_workflow_execution_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ]
      }
    ]
  }
}

Log Management

Structured Logging Configuration

javascript
// Log management with Winston
const winston = require('winston');

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: { service: 'n8n' },
  transports: [
    new winston.transports.File({
      filename: '/var/log/n8n/error.log',
      level: 'error'
    }),
    new winston.transports.File({
      filename: '/var/log/n8n/combined.log'
    }),
    new winston.transports.Console({
      format: winston.format.simple()
    })
  ]
});

// Workflow execution logging helper
function logWorkflowExecution(workflowId, executionId, status, error = null) {
  const logData = {
    workflowId,
    executionId,
    status,
    timestamp: new Date().toISOString()
  };

  if (error) {
    logData.error = {
      message: error.message,
      stack: error.stack
    };
    logger.error('Workflow execution failed', logData);
  } else {
    logger.info('Workflow execution completed', logData);
  }
}

ELK Stack Integration

yaml
# ELK services added to docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:7.14.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:7.14.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch

  filebeat:
    image: docker.elastic.co/beats/filebeat:7.14.0
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - /var/log/n8n:/var/log/n8n:ro
    depends_on:
      - logstash

Log Rotation Configuration

bash
# /etc/logrotate.d/n8n
/var/log/n8n/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 n8n n8n
    postrotate
        systemctl reload n8n
    endscript
}

Health Checks

Application Health Check Endpoint

javascript
// Health check route (assumes an Express-style app and checkDatabaseConnection /
// checkRedisConnection helpers defined elsewhere)
app.get('/health', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {}
  };

  try {
    // Database connectivity check
    await checkDatabaseConnection();
    health.checks.database = { status: 'healthy' };
  } catch (error) {
    health.checks.database = {
      status: 'unhealthy',
      error: error.message
    };
    health.status = 'unhealthy';
  }

  try {
    // Redis connectivity check
    await checkRedisConnection();
    health.checks.redis = { status: 'healthy' };
  } catch (error) {
    health.checks.redis = {
      status: 'unhealthy',
      error: error.message
    };
    health.status = 'unhealthy';
  }

  // Memory usage check
  const memUsage = process.memoryUsage();
  const memUsageMB = Math.round(memUsage.heapUsed / 1024 / 1024);
  health.checks.memory = {
    status: memUsageMB < 1000 ? 'healthy' : 'warning',
    usage: `${memUsageMB}MB`
  };

  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
});

Kubernetes Health Checks

yaml
# Health checks added to the Deployment spec
spec:
  containers:
  - name: n8n
    image: n8nio/n8n:latest
    livenessProbe:
      httpGet:
        path: /health
        port: 5678
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /health
        port: 5678
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 3
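
Docker Compose Health Check

The same idea works for the Docker Compose deployment by adding a healthcheck to the n8n service. A sketch, assuming wget is available inside the container (the official Alpine-based image ships busybox wget, but verify for your tag); /healthz is n8n's built-in endpoint, already referenced in the Nginx config above:

yaml
  n8n:
    # ... existing service definition ...
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--spider", "http://localhost:5678/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s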

Backup and Recovery

Data is the lifeblood of a business; a solid backup and recovery strategy can save the entire operation at a critical moment.

Data Backup Strategy

Tiered Backup Plan

Different types of data call for different backup strategies:

bash
#!/bin/bash
# comprehensive_backup.sh - combined backup script

# Configuration
BACKUP_ROOT="/backup"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# Database connection settings (export or adjust these before running)
DB_HOST="${DB_HOST:-localhost}"
DB_USER="${DB_USER:-n8n_user}"
DB_NAME="${DB_NAME:-n8n}"

# Database backup
backup_database() {
    echo "Starting database backup..."

    # Create the backup directory
    mkdir -p $BACKUP_ROOT/database/$DATE

    # Dump the main database
    pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME \
        --verbose --clean --no-owner --no-privileges \
        > $BACKUP_ROOT/database/$DATE/n8n_database.sql

    # Compress the dump
    cd $BACKUP_ROOT/database/$DATE
    tar -czf ../n8n_db_backup_$DATE.tar.gz .
    cd - > /dev/null

    echo "Database backup finished"
}

# Workflow data directory backup
backup_workflows() {
    echo "Starting workflow data backup..."

    mkdir -p $BACKUP_ROOT/workflows

    tar -czf $BACKUP_ROOT/workflows/n8n_workflows_$DATE.tar.gz \
        -C /home/node/.n8n \
        --exclude='logs/*' \
        --exclude='temp/*' \
        .

    echo "Workflow data backup finished"
}

# Run the backup
main() {
    echo "Starting n8n system backup - $(date)"

    backup_database
    backup_workflows

    # Prune backups older than RETENTION_DAYS
    find $BACKUP_ROOT -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete

    echo "Backup finished - $(date)"
}

main

Disaster Recovery Plan

Recovery Procedure

bash
#!/bin/bash
# disaster_recovery.sh - disaster recovery script
# DB_HOST, DB_USER, DB_NAME and BACKUP_FILE are expected to be set before running

# Database restore
recover_database() {
    echo "Restoring database..."

    # Stop the n8n services first
    docker-compose down

    # Recreate the database
    createdb -h $DB_HOST -U postgres $DB_NAME

    # Restore the dump
    psql -h $DB_HOST -U $DB_USER -d $DB_NAME < $BACKUP_FILE

    echo "Database restore finished"
}

# Verify the restore
verify_recovery() {
    echo "Verifying restore..."

    # Check database connectivity
    if psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT 1;" > /dev/null 2>&1; then
        echo "✓ database connection OK"
    else
        echo "✗ database connection failed"
        exit 1
    fi

    echo "Verification finished"
}

recover_database
verify_recovery

Summary

Deploying and operating n8n in production is a systematic effort that spans several areas:

  1. Deployment strategy: choose the right deployment model and design for high availability
  2. Database configuration: tune the database for performance and keep the data safe
  3. Monitoring: build out a complete monitoring and alerting setup
  4. Backup and recovery: define a thorough backup strategy and a disaster recovery plan
  5. Operational practice: standardize day-to-day operations procedures

Remember that operations is a matter of continuous improvement: review and refine the deployment architecture regularly and adjust resources as the business grows, so the system stays stable over the long term.

In the next article we will cover team collaboration and version control, and look at how to use n8n effectively in a team setting.