# n8n Deployment and Operations: Managing Production Environments

When n8n workflows move from development into production, more factors come into play: stability, scalability, security, and maintainability. This article covers how to deploy and operate n8n in a production environment.
## Production Deployment Strategy

### Choosing a Deployment Architecture

There are several deployment options to choose from in production:

**Single-host deployment** — suited to small teams or early adoption; low cost, but limited room to scale.

**Cluster deployment** — suited to large teams or high-concurrency scenarios; scales horizontally, but is more complex to configure.

**Cloud-managed deployment** — uses a cloud provider's container service; low operational overhead, but carries some vendor lock-in risk.
### Environment Planning

```text
# A typical three-environment architecture
Development        →   Staging                 →   Production
Rapid iteration        Functional validation       Stable operation
Data isolation         Performance testing         High availability
```
## Containerized Deployment with Docker

Docker is currently the most popular containerization option and keeps the n8n application consistent across environments.

### Base Dockerfile

```dockerfile
FROM n8nio/n8n:latest

# Set the working directory
WORKDIR /home/node

# Copy custom configuration and custom nodes
COPY config/ ./config/
COPY custom-nodes/ ./custom-nodes/

# Enable basic auth; the credentials and N8N_ENCRYPTION_KEY should be
# injected at runtime (docker run -e / compose env), never baked into the image
ENV N8N_BASIC_AUTH_ACTIVE=true

# Expose the n8n port
EXPOSE 5678

# Start n8n
CMD ["n8n", "start"]
```
### Docker Compose Configuration

```yaml
version: '3.8'

services:
  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
      - N8N_HOST=${N8N_HOST}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${N8N_HOST}/
      - GENERIC_TIMEZONE=${TIMEZONE}
      # Point n8n at the bundled Postgres; without these it falls back to SQLite
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - n8n_data:/home/node/.n8n
      - ./custom-nodes:/home/node/.n8n/custom
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:13
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data

volumes:
  n8n_data:
  postgres_data:
  redis_data:
```
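The compose file above reads its credentials from a `.env` file placed next to `docker-compose.yml`. A minimal sketch — every value here is a placeholder to replace:

```bash
# .env — placeholder values, replace before use
N8N_USER=admin
N8N_PASSWORD=change-me
N8N_HOST=n8n.example.com
TIMEZONE=Asia/Shanghai
POSTGRES_USER=n8n_user
POSTGRES_PASSWORD=change-me-too
POSTGRES_DB=n8n
```

Bring the stack up with `docker compose up -d` (or `docker-compose up -d` on older installs).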
## Kubernetes Cluster Deployment

For scenarios that need high availability and autoscaling, Kubernetes is the better choice.

### Deployment Configuration

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: n8n-deployment
  labels:
    app: n8n
spec:
  # NOTE: running multiple replicas requires queue mode (EXECUTIONS_MODE=queue),
  # and a shared /home/node/.n8n volume needs ReadWriteMany access
  replicas: 3
  selector:
    matchLabels:
      app: n8n
  template:
    metadata:
      labels:
        app: n8n
    spec:
      containers:
        - name: n8n
          image: n8nio/n8n:latest
          ports:
            - containerPort: 5678
          env:
            - name: N8N_BASIC_AUTH_ACTIVE
              value: "true"
            - name: N8N_BASIC_AUTH_USER
              valueFrom:
                secretKeyRef:
                  name: n8n-secret
                  key: username
            - name: N8N_BASIC_AUTH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: n8n-secret
                  key: password
            - name: DB_TYPE
              value: "postgresdb"
            - name: DB_POSTGRESDB_HOST
              value: "postgres-service"
            - name: DB_POSTGRESDB_DATABASE
              value: "n8n"
            - name: DB_POSTGRESDB_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: username
            - name: DB_POSTGRESDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
          volumeMounts:
            - name: n8n-data
              mountPath: /home/node/.n8n
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
      volumes:
        - name: n8n-data
          persistentVolumeClaim:
            claimName: n8n-pvc
```
### Service Configuration

```yaml
apiVersion: v1
kind: Service
metadata:
  name: n8n-service
spec:
  selector:
    app: n8n
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5678
  type: LoadBalancer
```
## Load Balancer Configuration

With multiple n8n instances, a load balancer is needed to distribute requests across them.

### Nginx Configuration

```nginx
upstream n8n_backend {
    least_conn;
    server n8n-1:5678 max_fails=3 fail_timeout=30s;
    server n8n-2:5678 max_fails=3 fail_timeout=30s;
    server n8n-3:5678 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name your-n8n-domain.com;

    # Redirect to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name your-n8n-domain.com;

    # TLS certificates
    ssl_certificate /etc/ssl/certs/your-cert.pem;
    ssl_certificate_key /etc/ssl/private/your-key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;

    # Security headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";

    location / {
        proxy_pass http://n8n_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    # Health check
    location /health {
        access_log off;
        proxy_pass http://n8n_backend/healthz;
    }
}
```
## Database Configuration

In production, database stability and performance directly determine overall system availability, so the database deserves careful tuning.

### PostgreSQL Configuration

PostgreSQL is the recommended production database for n8n; compared with SQLite it offers better concurrency and data integrity.

**Basic configuration**

```bash
# Environment variables
export DB_TYPE=postgresdb
export DB_POSTGRESDB_HOST=localhost
export DB_POSTGRESDB_PORT=5432
export DB_POSTGRESDB_DATABASE=n8n
export DB_POSTGRESDB_USER=n8n_user
export DB_POSTGRESDB_PASSWORD=secure_password
export DB_POSTGRESDB_SCHEMA=public
```
**PostgreSQL tuning**

```conf
# postgresql.conf tuning (values sized for roughly 1 GB of RAM for PostgreSQL)
shared_buffers = 256MB                  # shared buffer pool
effective_cache_size = 1GB              # planner's estimate of OS cache
work_mem = 4MB                          # per-sort/hash working memory
maintenance_work_mem = 64MB             # memory for VACUUM, CREATE INDEX, etc.
checkpoint_completion_target = 0.9      # spread out checkpoint I/O
wal_buffers = 16MB                      # WAL buffer size
default_statistics_target = 100         # statistics detail level

# Connections
max_connections = 200                   # maximum concurrent connections
shared_preload_libraries = 'pg_stat_statements'  # preload query-stats extension

# Logging
log_statement = 'mod'                   # log data-modifying statements
log_min_duration_statement = 1000       # log queries slower than 1 second
log_checkpoints = on                    # log checkpoints
log_connections = on                    # log connections
log_disconnections = on                 # log disconnections
```
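The sizes above assume roughly 1 GB of RAM available to PostgreSQL. A common rule of thumb (a convention, not an official recommendation) is to set `shared_buffers` to about 25% of that RAM; a tiny helper makes the arithmetic explicit:

```shell
# Rule-of-thumb helper: suggest shared_buffers as ~25% of available RAM.
# The RAM size is passed in explicitly so the calculation is easy to check.
suggest_shared_buffers() {
    local total_ram_mb=$1
    echo "$(( total_ram_mb / 4 ))MB"
}

suggest_shared_buffers 1024   # prints 256MB, matching the config above
```

Re-run the helper whenever the host is resized, and reload PostgreSQL after changing the setting.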
**Database initialization script**

```sql
-- Create the database and user
CREATE DATABASE n8n;
CREATE USER n8n_user WITH ENCRYPTED PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE n8n TO n8n_user;

-- Create the required extensions
\c n8n
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";

-- Grant schema privileges
GRANT ALL ON SCHEMA public TO n8n_user;
GRANT ALL ON ALL TABLES IN SCHEMA public TO n8n_user;
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO n8n_user;
```
### Redis Cache Configuration

Redis is used mainly for queue management and caching, and can noticeably improve system performance.

**Redis configuration file**

```conf
# redis.conf
bind 127.0.0.1
port 6379
timeout 300
tcp-keepalive 60

# Memory
maxmemory 512mb
maxmemory-policy allkeys-lru

# Persistence
save 900 1
save 300 10
save 60 10000

# Logging
loglevel notice
logfile /var/log/redis/redis-server.log

# Security
requirepass your_redis_password
```
**n8n Redis integration**

```bash
# Environment variables
export QUEUE_BULL_REDIS_HOST=localhost
export QUEUE_BULL_REDIS_PORT=6379
export QUEUE_BULL_REDIS_PASSWORD=your_redis_password
export QUEUE_BULL_REDIS_DB=0

# Queue mode
export EXECUTIONS_MODE=queue
export QUEUE_BULL_REDIS_TIMEOUT_THRESHOLD=30000
```
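In queue mode the main process only enqueues jobs; separate worker processes execute them. With Docker Compose, a worker can be added as an extra service running the `worker` command. A sketch to nest under `services:` — the service name and Redis password here are assumptions:

```yaml
  n8n-worker:
    image: n8nio/n8n:latest
    restart: unless-stopped
    command: worker                 # run as a queue worker, not the web UI
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PASSWORD=your_redis_password
      # workers also need the same DB_* variables as the main instance
    depends_on:
      - redis
```

Scale workers independently of the web process (`docker compose up -d --scale n8n-worker=3`) as execution volume grows.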
### Database Backup Strategy

**Automated backup script**

```bash
#!/bin/bash
# backup_n8n.sh
# Assumes credentials are supplied via ~/.pgpass or PGPASSWORD

# Configuration
DB_NAME="n8n"
DB_USER="n8n_user"
BACKUP_DIR="/backup/n8n"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="n8n_backup_${DATE}.sql"

# Create the backup directory
mkdir -p "$BACKUP_DIR"

# Dump the database
pg_dump -h localhost -U "$DB_USER" -d "$DB_NAME" > "$BACKUP_DIR/$BACKUP_FILE"

# Compress the dump
gzip "$BACKUP_DIR/$BACKUP_FILE"

# Delete backups older than 7 days
find "$BACKUP_DIR" -name "*.gz" -mtime +7 -delete

# Log the run
echo "$(date): Backup completed - $BACKUP_FILE.gz" >> /var/log/n8n_backup.log
```
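The `find -mtime +7 -delete` retention rule is easy to get wrong, and a mistake deletes backups. It can be sanity-checked in a throwaway directory before pointing it at real data:

```shell
# Verify the retention rule against fake backup files in a temp directory
tmpdir=$(mktemp -d)
touch "$tmpdir/new_backup.gz"                      # fresh file: must survive
touch -d "10 days ago" "$tmpdir/old_backup.gz"     # old file: must be deleted

find "$tmpdir" -name "*.gz" -mtime +7 -delete

ls "$tmpdir"   # only new_backup.gz remains
```

Note that `-mtime +7` matches files whose age rounds down to more than 7 full days, so a backup from exactly 7 days ago is kept.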
**Scheduled backup jobs**

```bash
# Add to crontab
# Daily backup at 02:00
0 2 * * * /path/to/backup_n8n.sh

# Full backup every Sunday at 01:00
0 1 * * 0 /path/to/full_backup_n8n.sh
```
## Monitoring and Logging

Monitoring is key to keeping a production system running reliably. We need to watch application performance, resource usage, and business metrics.

### Application Monitoring

**Prometheus + Grafana monitoring stack**

First, configure Prometheus to scrape n8n's metrics. Note that n8n serves `/metrics` only when the `N8N_METRICS=true` environment variable is set:

```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['localhost:5678']
    metrics_path: '/metrics'
    scrape_interval: 30s

  - job_name: 'postgres'
    static_configs:
      - targets: ['localhost:9187']

  - job_name: 'redis'
    static_configs:
      - targets: ['localhost:9121']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
```
**Custom monitoring metrics**

```javascript
// Add custom metrics with prom-client
const prometheus = require('prom-client');

// Define the metrics
const workflowExecutions = new prometheus.Counter({
  name: 'n8n_workflow_executions_total',
  help: 'Total number of workflow executions',
  labelNames: ['workflow_name', 'status']
});

const executionDuration = new prometheus.Histogram({
  name: 'n8n_workflow_execution_duration_seconds',
  help: 'Duration of workflow executions',
  labelNames: ['workflow_name'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 300]
});

// Record the metrics when a workflow runs
function recordWorkflowExecution(workflowName, status, duration) {
  workflowExecutions.inc({ workflow_name: workflowName, status: status });
  executionDuration.observe({ workflow_name: workflowName }, duration);
}
```
**Grafana dashboard configuration**

```json
{
  "dashboard": {
    "title": "n8n Monitoring Dashboard",
    "panels": [
      {
        "title": "Workflow executions",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(n8n_workflow_executions_total[5m]))",
            "legendFormat": "executions/s"
          }
        ]
      },
      {
        "title": "Workflow success rate",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(n8n_workflow_executions_total{status=\"success\"}[5m])) / sum(rate(n8n_workflow_executions_total[5m])) * 100",
            "legendFormat": "success %"
          }
        ]
      },
      {
        "title": "Execution time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(n8n_workflow_execution_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ]
      }
    ]
  }
}
```
### Log Management

**Structured logging configuration**

```javascript
// Log management with Winston
const winston = require('winston');

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: { service: 'n8n' },
  transports: [
    new winston.transports.File({
      filename: '/var/log/n8n/error.log',
      level: 'error'
    }),
    new winston.transports.File({
      filename: '/var/log/n8n/combined.log'
    }),
    new winston.transports.Console({
      format: winston.format.simple()
    })
  ]
});

// Workflow execution logging
function logWorkflowExecution(workflowId, executionId, status, error = null) {
  const logData = {
    workflowId,
    executionId,
    status,
    timestamp: new Date().toISOString()
  };

  if (error) {
    logData.error = {
      message: error.message,
      stack: error.stack
    };
    logger.error('Workflow execution failed', logData);
  } else {
    logger.info('Workflow execution completed', logData);
  }
}
```
**ELK Stack integration**

```yaml
# Add ELK services to docker-compose.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:7.14.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:7.14.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch

  filebeat:
    image: docker.elastic.co/beats/filebeat:7.14.0
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - /var/log/n8n:/var/log/n8n:ro
    depends_on:
      - logstash

volumes:
  elasticsearch_data:
```
**Log rotation configuration**

```bash
# /etc/logrotate.d/n8n
# Assumes n8n runs as a systemd service named "n8n"
/var/log/n8n/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 n8n n8n
    postrotate
        systemctl reload n8n
    endscript
}
```
### Health Checks

**Application health-check endpoint**

```javascript
// Health-check route (Express-style); checkDatabaseConnection and
// checkRedisConnection are app-specific helpers you must implement
app.get('/health', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {}
  };

  try {
    // Database connectivity
    await checkDatabaseConnection();
    health.checks.database = { status: 'healthy' };
  } catch (error) {
    health.checks.database = {
      status: 'unhealthy',
      error: error.message
    };
    health.status = 'unhealthy';
  }

  try {
    // Redis connectivity
    await checkRedisConnection();
    health.checks.redis = { status: 'healthy' };
  } catch (error) {
    health.checks.redis = {
      status: 'unhealthy',
      error: error.message
    };
    health.status = 'unhealthy';
  }

  // Memory usage
  const memUsage = process.memoryUsage();
  const memUsageMB = Math.round(memUsage.heapUsed / 1024 / 1024);
  health.checks.memory = {
    status: memUsageMB < 1000 ? 'healthy' : 'warning',
    usage: `${memUsageMB}MB`
  };

  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
});
```
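On the consuming side, a deployment script or cron job can evaluate the endpoint's JSON response. A sketch that checks the `status` field — here fed a canned sample instead of the live `curl` call shown in the comment:

```shell
# Evaluate a health-check response; in production the JSON would come from:
#   response=$(curl -s https://your-n8n-domain.com/health)
response='{"status":"healthy","checks":{"database":{"status":"healthy"}}}'

if echo "$response" | grep -q '"status":"healthy"'; then
    health_ok=yes
else
    health_ok=no
fi
echo "health_ok=$health_ok"
```

For anything beyond a quick check, parse the JSON properly (e.g. with `jq`) instead of pattern-matching it.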
**Kubernetes health checks**

```yaml
# Add probes to the Deployment
spec:
  containers:
    - name: n8n
      image: n8nio/n8n:latest
      livenessProbe:
        httpGet:
          path: /health
          port: 5678
        initialDelaySeconds: 30
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /health
          port: 5678
        initialDelaySeconds: 5
        periodSeconds: 5
        timeoutSeconds: 3
        failureThreshold: 3
```
## Backup and Recovery

Data is the lifeblood of the business; a solid backup and recovery strategy can save the entire operation at a critical moment.

### Data Backup Strategy

**Layered backup plan**

Different kinds of data call for different backup strategies:

```bash
#!/bin/bash
# comprehensive_backup.sh - combined backup script

# Configuration
BACKUP_ROOT="/backup"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Database backup
backup_database() {
    echo "Starting database backup..."
    mkdir -p "$BACKUP_ROOT/database/$DATE"

    # Dump the main database
    pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" \
        --verbose --clean --no-owner --no-privileges \
        > "$BACKUP_ROOT/database/$DATE/n8n_database.sql"

    # Compress the dump
    cd "$BACKUP_ROOT/database/$DATE"
    tar -czf "../n8n_db_backup_$DATE.tar.gz" .
    cd - > /dev/null

    echo "Database backup finished"
}

# Workflow files backup
backup_workflows() {
    echo "Starting workflow files backup..."
    mkdir -p "$BACKUP_ROOT/workflows"
    tar -czf "$BACKUP_ROOT/workflows/n8n_workflows_$DATE.tar.gz" \
        -C /home/node/.n8n \
        --exclude='logs/*' \
        --exclude='temp/*' \
        .
    echo "Workflow files backup finished"
}

# Remove backups older than the retention window
cleanup_old_backups() {
    find "$BACKUP_ROOT" -name "*.tar.gz" -mtime +"$RETENTION_DAYS" -delete
}

# Run the backup
main() {
    echo "Starting n8n system backup - $(date)"
    backup_database
    backup_workflows
    cleanup_old_backups
    echo "Backup finished - $(date)"
}

main
```
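A backup that has never been test-read is not really a backup. Before relying on the archives produced above, verify that they can be listed and extracted; the check is demonstrated here on a freshly created archive so it runs anywhere:

```shell
# Create a small archive the same way the backup script does, then verify it
workdir=$(mktemp -d)
echo "SELECT 1;" > "$workdir/n8n_database.sql"
tar -czf "$workdir/backup.tar.gz" -C "$workdir" n8n_database.sql

# tar -tzf lists the archive without extracting; a non-zero exit means corruption
if tar -tzf "$workdir/backup.tar.gz" > /dev/null; then
    echo "archive verified"
fi
```

In practice, run the same `tar -tzf` (or a full restore into a scratch database) against each night's real archive and alert on failure.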
### Disaster Recovery Plan

**Recovery runbook**

```bash
#!/bin/bash
# disaster_recovery.sh - disaster recovery script
# Usage: ./disaster_recovery.sh <backup_file.sql>

BACKUP_FILE="$1"

# Database recovery
recover_database() {
    echo "Restoring database..."

    # Stop the n8n service first
    docker-compose down

    # Recreate the database
    createdb -h "$DB_HOST" -U postgres "$DB_NAME"

    # Restore the data
    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" < "$BACKUP_FILE"

    echo "Database restore finished"
}

# Verify the recovery
verify_recovery() {
    echo "Verifying recovery..."

    # Check database connectivity
    if psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;" > /dev/null 2>&1; then
        echo "✓ Database connection OK"
    else
        echo "✗ Database connection failed"
        exit 1
    fi

    echo "Recovery verification finished"
}

recover_database
verify_recovery
```
## Wrapping Up

Deploying and operating n8n in production is a systems effort that spans several areas:

- **Deployment strategy**: pick the deployment model that fits, and design for high availability
- **Database configuration**: tune the database for performance and keep the data safe
- **Monitoring**: build out a complete monitoring and alerting pipeline
- **Backup and recovery**: define a thorough backup strategy and a disaster recovery plan
- **Operational practice**: standardize day-to-day operational procedures

Remember that operations is continuous improvement. Review and refine the deployment architecture regularly, and adjust resources as the business grows; that is what keeps the system stable over the long term.

In the next article we will look at team collaboration and version control, and how to use n8n effectively in a team setting.