portfolio/docs/ai-image-generation/SETUP.md
2026-01-07 14:30:00 +01:00

# AI Image Generation Setup
This guide explains how to set up automatic AI-powered image generation for your portfolio projects using local AI models.
## Overview
The system automatically generates project cover images by:
1. Reading project metadata (title, description, tags, tech stack)
2. Creating an optimized prompt for image generation
3. Sending the prompt to a local AI image generator
4. Saving the generated image
5. Updating the project's `imageUrl` in the database
## Supported Local AI Tools
### Option 1: Stable Diffusion WebUI (AUTOMATIC1111) - Recommended
**Pros:**
- Most mature and widely used
- Excellent API support
- Large model ecosystem
- Easy to use
**Installation:**
```bash
# Clone the repository
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
# Install and run (will download models automatically)
./webui.sh --api --listen
```
**API Endpoint:** `http://localhost:7860`
**Recommended Models:**
- **SDXL Base 1.0** - High quality, versatile
- **Realistic Vision V5.1** - Photorealistic images
- **DreamShaper 8** - Artistic, tech-focused imagery
- **Juggernaut XL** - Modern, clean aesthetics
**Download Models:**
```bash
cd models/Stable-diffusion/
# SDXL Base (6.94 GB)
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
# Or use the WebUI's model downloader
```
### Option 2: ComfyUI
**Pros:**
- Node-based workflow system
- More control over generation pipeline
- Better for complex compositions
**Installation:**
```bash
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py --listen 0.0.0.0 --port 8188
```
**API Endpoint:** `http://localhost:8188`
### Option 3: Ollama + Stable Diffusion
**Pros:**
- Lightweight
- Easy model management
- Can combine with LLM for better prompts
**Installation:**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Install a vision-capable model
ollama pull llava
# For image generation, you'll still need SD WebUI or ComfyUI
```
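Ollama does not generate images itself; its role here is to write better prompts. A minimal sketch of that idea follows, assuming Ollama's default local endpoint (`http://localhost:11434/api/generate`) and a text model you have already pulled. The model name `llama3`, the helper names, and the instruction wording are illustrative assumptions, not part of this setup.

```javascript
// Sketch only: helper names, model name and instruction text are assumptions.
const OLLAMA_URL = 'http://localhost:11434/api/generate'; // Ollama's default port

// Build the JSON body for a non-streaming Ollama generate call.
function buildOllamaRequest(project, model = 'llama3') {
  const instruction =
    'Write one short Stable Diffusion prompt (no text, no people) ' +
    `for a cover image of a project titled "${project.title}". ` +
    `Description: ${project.description || 'n/a'}`;
  return { model, prompt: instruction, stream: false };
}

// Send the request; returns the model's text, or `fallback` on any failure.
async function enhancePrompt(project, fallback, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(OLLAMA_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(buildOllamaRequest(project)),
    });
    if (!res.ok) return fallback;
    const data = await res.json();
    return (data.response || '').trim() || fallback;
  } catch {
    return fallback;
  }
}
```

Returning the fallback on any error keeps image generation working with the template prompt when Ollama is not running.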
## n8n Workflow Setup
### 1. Install n8n (if not already installed)
```bash
# Docker Compose (recommended)
docker-compose up -d n8n
# Or npm
npm install -g n8n
n8n start
```
### 2. Import Workflow
1. Open n8n at `http://localhost:5678`
2. Go to **Workflows** → **Import from File**
3. Import `n8n-workflows/ai-image-generator.json`
### 3. Configure Workflow Nodes
#### Node 1: Webhook Trigger
- **Method:** POST
- **Path:** `ai-image-generation`
- **Authentication:** Header Auth (use secret token)
#### Node 2: Postgres - Get Project Data
```sql
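-- Pass the ID as a query parameter ($1) instead of interpolating it into the SQL.
-- In the node's Query Parameters option, supply: {{ $json.body.projectId }}
-- (the Webhook node places the POST payload under `body`).
SELECT id, title, description, tags, category, content
FROM projects
WHERE id = $1
LIMIT 1;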
```
#### Node 3: Code - Build AI Prompt
```javascript
// Extract project data
const project = $input.first().json;

// Build sophisticated prompt
const styleKeywords = {
  'web': 'modern web interface, clean UI, gradient backgrounds, glass morphism',
  'mobile': 'mobile app mockup, sleek design, app icons, smartphone screen',
  'devops': 'server infrastructure, network diagram, cloud architecture, terminal windows',
  'game': 'game scene, 3D environment, gaming interface, player HUD',
  'ai': 'neural network visualization, AI chip, data flow, futuristic tech',
  'automation': 'workflow diagram, automated processes, gears and circuits'
};
const categoryStyle = styleKeywords[project.category?.toLowerCase()] || 'technology concept';

const prompt = `
  Professional tech project cover image, ${categoryStyle},
  representing "${project.title}",
  modern design, vibrant colors, high quality,
  isometric view, minimalist, clean composition,
  4k resolution, trending on artstation,
  color palette: blue, purple, teal accents,
  no text, no people, no logos
`.trim().replace(/\s+/g, ' ');

const negativePrompt = `
  low quality, blurry, pixelated, text, watermark,
  signature, logo, people, faces, hands,
  cluttered, messy, dark, gloomy
`.trim().replace(/\s+/g, ' ');

return {
  json: {
    projectId: project.id,
    prompt: prompt,
    negativePrompt: negativePrompt,
    title: project.title,
    category: project.category
  }
};
```
#### Node 4: HTTP Request - Generate Image (Stable Diffusion)
- **Method:** POST
- **URL:** `http://your-sd-server:7860/sdapi/v1/txt2img`
- **Body:**
```json
{
  "prompt": "={{ $json.prompt }}",
  "negative_prompt": "={{ $json.negativePrompt }}",
  "steps": 30,
  "cfg_scale": 7,
  "width": 1024,
  "height": 768,
  "sampler_name": "DPM++ 2M Karras",
  "seed": -1,
  "batch_size": 1,
  "n_iter": 1
}
```
#### Node 5: Code - Save Image to File
```javascript
// Requires NODE_FUNCTION_ALLOW_BUILTIN=fs,path in the n8n environment
const fs = require('fs');
const path = require('path');

const imageData = $input.first().json.images[0]; // Base64 image
// The SD response does not carry the project ID, so read it from the prompt node.
// The name must match that node's name in your workflow.
const projectId = $('Code - Build AI Prompt').first().json.projectId;
const timestamp = Date.now();

// Create directory if it doesn't exist
const uploadDir = '/app/public/generated-images';
if (!fs.existsSync(uploadDir)) {
  fs.mkdirSync(uploadDir, { recursive: true });
}

// Save image
const filename = `project-${projectId}-${timestamp}.png`;
const filepath = path.join(uploadDir, filename);
fs.writeFileSync(filepath, Buffer.from(imageData, 'base64'));

return {
  json: {
    projectId: projectId,
    imageUrl: `/generated-images/${filename}`,
    filepath: filepath
  }
};
```
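Because the project ID ends up inside a filename, it is worth validating it before writing to disk. The sketch below assumes project IDs are positive integers (as in the `123` example used in this guide); `buildImageName` is a hypothetical helper, not part of the workflow file.

```javascript
// Hypothetical helper: derive the filename and public URL for a project image.
// Assumption: project IDs are positive integers; anything else is rejected,
// which also blocks path-traversal strings such as "../etc".
function buildImageName(projectId, timestamp = Date.now()) {
  const id = Number(projectId);
  if (!Number.isInteger(id) || id <= 0) {
    throw new Error(`Invalid projectId: ${projectId}`);
  }
  const filename = `project-${id}-${timestamp}.png`;
  return { filename, imageUrl: `/generated-images/${filename}` };
}

// Example: buildImageName(123, 1234567890).filename
// → "project-123-1234567890.png"
```

The returned `filename` can then be joined with the upload directory exactly as Node 5 does.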
#### Node 6: Postgres - Update Project
```sql
-- Supply Query Parameters: {{ $json.imageUrl }}, {{ $json.projectId }}
UPDATE projects
SET image_url = $1,
    updated_at = NOW()
WHERE id = $2;
```
#### Node 7: Webhook Response
```json
{
  "success": true,
  "projectId": "={{ $json.projectId }}",
  "imageUrl": "={{ $json.imageUrl }}",
  "message": "Image generated successfully"
}
```
## API Integration
### Generate Image for Project
**Endpoint:** `POST /api/n8n/generate-image`
**Request:**
```json
{
  "projectId": 123,
  "regenerate": false
}
```
**Response:**
```json
{
  "success": true,
  "projectId": 123,
  "imageUrl": "/generated-images/project-123-1234567890.png",
  "generatedAt": "2024-01-15T10:30:00Z"
}
```
### Automatic Generation on Project Creation
Add this to your project creation API:
```typescript
// After creating project in database
if (process.env.AUTO_GENERATE_IMAGES === 'true') {
  await fetch(`${process.env.N8N_WEBHOOK_URL}/ai-image-generation`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.N8N_SECRET_TOKEN}`
    },
    body: JSON.stringify({
      projectId: newProject.id
    })
  });
}
```
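The snippet above awaits the webhook with no timeout or error handling, so a slow or unreachable n8n instance could stall or fail project creation. One possible hardening is sketched below. The function name `triggerImageGeneration`, the injected `fetchImpl` parameter, and the 90-second default (chosen to exceed the generation times listed later in this guide) are assumptions for illustration.

```javascript
// Sketch: trigger the n8n webhook without letting failures break project creation.
// `env` is expected to look like process.env; `fetchImpl` is injectable for testing.
async function triggerImageGeneration(projectId, env, fetchImpl = fetch, timeoutMs = 90_000) {
  if (env.AUTO_GENERATE_IMAGES !== 'true') {
    return { triggered: false, reason: 'disabled' };
  }

  // Abort the request if n8n has not answered within timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${env.N8N_WEBHOOK_URL}/ai-image-generation`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${env.N8N_SECRET_TOKEN}`,
      },
      body: JSON.stringify({ projectId }),
      signal: controller.signal,
    });
    return { triggered: res.ok, reason: res.ok ? 'ok' : `http ${res.status}` };
  } catch (err) {
    // Report the failure instead of throwing, so the caller can log and continue.
    return { triggered: false, reason: err.name === 'AbortError' ? 'timeout' : 'network' };
  } finally {
    clearTimeout(timer);
  }
}
```

A caller might run `const result = await triggerImageGeneration(newProject.id, process.env);` and log `result.reason` when `result.triggered` is false.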
## Environment Variables
Add to `.env.local`:
```bash
# AI Image Generation
N8N_WEBHOOK_URL=http://localhost:5678/webhook
N8N_SECRET_TOKEN=your-secure-token-here
AUTO_GENERATE_IMAGES=true
# Stable Diffusion API
SD_API_URL=http://localhost:7860
SD_API_KEY=optional-if-protected
# Image Storage
GENERATED_IMAGES_DIR=/app/public/generated-images
```
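Missing variables tend to surface as confusing runtime errors, so it can help to validate them once at startup. The sketch below is one way to do that; `loadImageConfig` is a hypothetical helper, and the choice of which variables are required is an assumption.

```javascript
// Sketch: read and validate the variables listed above.
// Assumption: the two n8n values and SD_API_URL are required, the rest optional.
function loadImageConfig(env) {
  const required = ['N8N_WEBHOOK_URL', 'N8N_SECRET_TOKEN', 'SD_API_URL'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const stripTrailingSlash = (url) => url.replace(/\/+$/, '');
  return {
    webhookUrl: stripTrailingSlash(env.N8N_WEBHOOK_URL),
    secretToken: env.N8N_SECRET_TOKEN,
    autoGenerate: env.AUTO_GENERATE_IMAGES === 'true',
    sdApiUrl: stripTrailingSlash(env.SD_API_URL),
    sdApiKey: env.SD_API_KEY || null, // optional
    imagesDir: env.GENERATED_IMAGES_DIR || '/app/public/generated-images',
  };
}
```

Typical usage would be `const config = loadImageConfig(process.env);` near application startup.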
## Prompt Engineering Tips
### Good Prompts for Tech Projects
**Web Application:**
```
modern web dashboard interface, clean UI design, gradient background,
glass morphism, floating panels, data visualization, charts and graphs,
vibrant blue and purple color scheme, isometric view, 4k quality
```
**Mobile App:**
```
sleek mobile app interface mockup, smartphone screen, modern app design,
minimalist UI, smooth gradients, app icons, notification badges,
floating elements, teal and pink accents, professional photography
```
**DevOps/Infrastructure:**
```
cloud infrastructure diagram, server network visualization,
interconnected nodes, data flow arrows, container icons,
modern tech illustration, isometric perspective, cyan and orange colors
```
**AI/ML Project:**
```
artificial intelligence concept, neural network visualization,
glowing nodes and connections, data streams, futuristic interface,
holographic elements, purple and blue neon lighting, high tech
```
### Negative Prompts (What to Avoid)
```
text, watermark, signature, logo, brand name, letters, numbers,
people, faces, hands, fingers, human figures,
low quality, blurry, pixelated, jpeg artifacts,
dark, gloomy, depressing, messy, cluttered,
realistic photo, stock photo
```
## Image Specifications
**Recommended Settings:**
- **Resolution:** 1024x768 (4:3 aspect ratio for cards)
- **Format:** PNG (with transparency support)
- **Size:** < 500KB (optimize after generation)
- **Color Profile:** sRGB
- **Sampling Steps:** 25-35 (balance quality vs speed)
- **CFG Scale:** 6-8 (how closely to follow prompt)
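These ranges can be enforced in code before a request is sent. The following is a minimal sketch; the function name is hypothetical, and rounding dimensions to a multiple of 8 is an assumption based on Stable Diffusion models generally expecting sizes divisible by 8.

```javascript
// Sketch: clamp generation settings to the recommended ranges above.
function normalizeSettings({ width = 1024, height = 768, steps = 30, cfgScale = 7 } = {}) {
  const clamp = (value, min, max) => Math.min(max, Math.max(min, value));
  const toMultipleOf8 = (value) => Math.max(8, Math.round(value / 8) * 8);
  return {
    width: toMultipleOf8(width),
    height: toMultipleOf8(height),
    steps: Math.round(clamp(steps, 25, 35)), // 25-35 sampling steps
    cfg_scale: clamp(cfgScale, 6, 8),        // CFG scale 6-8
  };
}

// Example: normalizeSettings({ width: 1000, height: 770, steps: 50, cfgScale: 12 })
// → { width: 1000, height: 768, steps: 35, cfg_scale: 8 }
```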
## Optimization
### Post-Processing Pipeline
```bash
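# Install image optimization tools (the `sharp` command is provided by the sharp-cli package)
npm install -g sharp-cli tinypng-cli
# Convert a generated image to WebP at quality 85 (the output directory must exist)
sharp -i input.png -o optimized/ -f webp -q 85
# Or use TinyPNG
tinypng input.png --key YOUR_API_KEY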
```
### Caching Strategy
```typescript
// Cache generated images in Redis
await redis.set(
  `project:${projectId}:image`,
  imageUrl,
  'EX',
  60 * 60 * 24 * 30 // 30 days
);
```
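Writing to the cache is only half of the strategy; the read path matters too. A cache-aside lookup could be sketched as follows, assuming a client with ioredis-style `get`/`set` methods. `getOrGenerateImage` and the `generate` callback are hypothetical names.

```javascript
// Sketch of a cache-aside lookup: return the cached URL if present,
// otherwise generate an image and cache its URL for 30 days.
const THIRTY_DAYS_SECONDS = 60 * 60 * 24 * 30;

async function getOrGenerateImage(redis, projectId, generate) {
  const key = `project:${projectId}:image`;
  const cached = await redis.get(key);
  if (cached) return { imageUrl: cached, fromCache: true };

  const imageUrl = await generate(projectId); // e.g. calls the n8n webhook
  await redis.set(key, imageUrl, 'EX', THIRTY_DAYS_SECONDS);
  return { imageUrl, fromCache: false };
}
```

When a project is regenerated on purpose, the key would need to be deleted or overwritten so the stale URL is not served.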
## Monitoring & Debugging
### Check Stable Diffusion Status
```bash
curl http://localhost:7860/sdapi/v1/sd-models
```
### View n8n Execution Logs
1. Open n8n UI → **Executions**
2. Filter by workflow "AI Image Generator"
3. Check error logs and execution time
### Test Image Generation
```bash
curl -X POST http://localhost:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "modern tech interface, blue gradient",
    "steps": 20,
    "width": 512,
    "height": 512
  }'
```
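The response is a JSON object whose `images` array holds base64-encoded image data, which is hard to judge by eye. A small check like the one below can confirm that the first entry decodes to a PNG. It assumes the WebUI is left on its default PNG output format; the helper names are hypothetical.

```javascript
// Sketch: verify that a txt2img response contains a decodable PNG.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function isPngBase64(base64) {
  if (typeof base64 !== 'string' || base64.length === 0) return false;
  // Tolerate a data-URI prefix by keeping only the part after the last comma.
  const payload = base64.includes(',') ? base64.split(',').pop() : base64;
  const bytes = Buffer.from(payload, 'base64');
  // Every PNG file starts with the same 8-byte signature.
  return PNG_SIGNATURE.every((byte, i) => bytes[i] === byte);
}

function firstImageIsValid(response) {
  return Array.isArray(response?.images) && isPngBase64(response.images[0]);
}
```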
## Troubleshooting
### "CUDA out of memory"
- Reduce image resolution (768x576 instead of 1024x768)
- Lower batch size to 1
- Use `--lowvram` or `--medvram` flags when starting SD
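The resolution reduction can also be automated as a retry ladder. The sketch below generates progressively smaller sizes that keep the aspect ratio; the 0.75 scale factor, the 512-pixel floor, and the function name are assumptions for illustration.

```javascript
// Sketch: list fallback resolutions to retry with after an out-of-memory error.
function fallbackResolutions(width, height, { scale = 0.75, minWidth = 512 } = {}) {
  if (scale <= 0 || scale >= 1) throw new Error('scale must be between 0 and 1');
  const roundTo8 = (value) => Math.round(value / 8) * 8;
  const sizes = [];
  let w = width;
  let h = height;
  while (true) {
    w = roundTo8(w * scale);
    h = roundTo8(h * scale);
    if (w < minWidth) break; // stop once the image would get too small
    sizes.push({ width: w, height: h });
  }
  return sizes;
}

// Example: fallbackResolutions(1024, 768)
// → [{ width: 768, height: 576 }, { width: 576, height: 432 }]
```

The first fallback for 1024x768 is the 768x576 size suggested above.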
### "Connection refused to SD API"
- Check if SD WebUI is running: `ps aux | grep webui`
- Verify API is enabled: `--api` flag in startup
- Check firewall: `sudo ufw allow 7860`
### "Poor image quality"
- Increase sampling steps (30-40)
- Try different samplers (Euler a, DPM++ 2M Karras)
- Adjust CFG scale (7-9)
- Use better checkpoint model (SDXL, Realistic Vision)
### "Images don't match project theme"
- Refine prompts with more specific keywords
- Use category-specific style templates
- Add technical keywords from project tags
- Experiment with different negative prompts
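Adding keywords from project tags could look like the sketch below. It assumes `tags` is an array of strings; the helper name, the five-tag cap, and the character whitelist are illustrative choices.

```javascript
// Sketch: append up to `maxTags` cleaned, de-duplicated tags to a base prompt.
function addTagKeywords(basePrompt, tags = [], maxTags = 5) {
  const seen = new Set();
  const keywords = [];
  for (const tag of tags) {
    if (keywords.length >= maxTags) break;
    // Lowercase and drop characters outside a small whitelist.
    const cleaned = String(tag).toLowerCase().replace(/[^a-z0-9+#. -]/g, '').trim();
    if (cleaned && !seen.has(cleaned)) {
      seen.add(cleaned);
      keywords.push(cleaned);
    }
  }
  return keywords.length > 0 ? `${basePrompt}, ${keywords.join(', ')}` : basePrompt;
}

// Example: addTagKeywords('modern web interface', ['React', 'react', 'Node.js'])
// → "modern web interface, react, node.js"
```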
## Advanced: Multi-Model Strategy
Use different models for different project types:
```javascript
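const modelMap = {
  'web': 'dreamshaper_8.safetensors',
  'mobile': 'realisticVision_v51.safetensors',
  'devops': 'juggernautXL_v8.safetensors',
  'ai': 'sd_xl_base_1.0.safetensors'
};
// Fall back to SDXL Base for categories without a mapping (e.g. 'game')
const checkpoint = modelMap[project.category?.toLowerCase()] || 'sd_xl_base_1.0.safetensors';
// Switch model before generation
await fetch('http://localhost:7860/sdapi/v1/options', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    sd_model_checkpoint: checkpoint
  })
});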
```
## Security Considerations
1. **Isolate SD WebUI:** Run in Docker container, not exposed to internet
2. **Authentication:** Protect n8n webhooks with tokens
3. **Rate Limiting:** Limit image generation requests
4. **Content Filtering:** Validate prompts to prevent abuse
5. **Resource Limits:** Set GPU memory limits in Docker
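For the rate limiting item, a fixed-window counter is often enough to start with. The sketch below keeps counts in memory, so it only suits a single-process deployment; the limits and the function name are assumptions.

```javascript
// Sketch: fixed-window, in-memory rate limiter keyed by caller (e.g. user ID or IP).
// `now` is injectable so the clock can be controlled in tests.
function createRateLimiter({ limit = 5, windowMs = 60_000, now = Date.now } = {}) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key) {
    const t = now();
    const entry = windows.get(key);
    // Start a new window on first use or after the previous one expired.
    if (!entry || t - entry.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (entry.count < limit) {
      entry.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}
```

An API route could call `allow(userId)` before triggering generation and answer with HTTP 429 when it returns `false`.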
## Cost & Performance
**Hardware Requirements:**
- **Minimum:** 8GB RAM, GTX 1060 6GB
- **Recommended:** 16GB RAM, RTX 3060 12GB
- **Optimal:** 32GB RAM, RTX 4090 24GB
**Generation Time:**
- **512x512:** ~5-10 seconds
- **1024x768:** ~15-30 seconds
- **1024x1024 (SDXL):** ~30-60 seconds
**Storage:**
- ~500KB per optimized image
- ~50MB for 100 projects
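The storage figure is simple multiplication, but note that Node 5 writes a new timestamped file on every run, so regenerated images accumulate unless old files are cleaned up. A rough estimate, assuming decimal units (1 MB = 1000 KB) and a hypothetical helper name:

```javascript
// Sketch: estimate disk usage in MB for generated images.
function estimateStorageMB(projectCount, avgImageKB = 500, versionsPerProject = 1) {
  return (projectCount * versionsPerProject * avgImageKB) / 1000;
}

// Example: estimateStorageMB(100)         → 50
//          estimateStorageMB(100, 500, 3) → 150
```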
## Future Enhancements
- [ ] Style transfer from existing brand assets
- [ ] A/B testing different image variants
- [ ] User feedback loop for prompt refinement
- [ ] Batch generation for multiple projects
- [ ] Integration with DALL-E 3 / Midjourney as fallback
- [ ] Automatic alt text generation for accessibility
- [ ] Version history for generated images
---
**Next Steps:**
1. Set up Stable Diffusion WebUI locally
2. Import n8n workflow
3. Test with sample project
4. Refine prompts based on results
5. Enable auto-generation for new projects