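feat: update the example-sentence image generation PRD to a two-stage architecture

- Redesign the flow as two stages: Gemini description generation + Replicate image generation
- Update the database design to support two-stage status tracking and cost recording
- Revise the API design spec to cover intermediate-state handling and progress reporting
- Add detailed technical implementation: GeminiImageDescriptionService + ReplicateImageGenerationService
- Adjust the cost-control strategy: per-stage credit deduction and smart cache matching
- Update development milestones to reflect the complexity of the two-stage implementation; total schedule 10-14 weeks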

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
鄭沛軒 2025-09-24 18:38:38 +08:00
parent fa0e74381b
commit 502e7f920b
1 changed files with 627 additions and 110 deletions


@@ -38,21 +38,48 @@
└── Elastic scaling and monitoring # Performance management
```
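### 2.2 Core System Components
### 2.2 Two-Stage AI Image Generation Architecture
#### 2.2.1 Image Generation Engine
- **AI image generation service** (DALL-E 3 / Midjourney API)
- **Image quality checks**: automatically filter inappropriate or low-quality content
#### 2.2.1 Stage 1: Image Description Generation (Gemini AI)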
```
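Flashcard content  →  Gemini API        →  Optimized image description prompt
       ↓                  ↓                       ↓
Word + example        Smart analysis        Detailed visual description
Context info          Style adjustment      Generation parameter tuning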
```
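- **Gemini prompt generation**: built on the existing GeminiAIProvider service
- **Context analysis**: combines word difficulty, learner level, and example-sentence content
- **Style optimization**: adjusts description complexity and visual style by CEFR level
- **Cost efficiency**: reuses the existing Gemini integration; about $0.001 per call
#### 2.2.2 Stage 2: Image Generation (Replicate API)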
```
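Image description prompt  →  Replicate API           →  High-quality example-sentence image
          ↓                       ↓                          ↓
Optimized prompt             Image generation model     Quality check
Parameter config             (FLUX/SD XL)               Content moderation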
```
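- **Replicate image generation**: uses FLUX or Stable Diffusion XL models
- **Multi-model support**: supports different style and quality requirements
- **Batch processing queue**: processes large volumes of generation requests asynchronously
- **Generation parameter tuning**: adjusts image style by learning level
- **Quality checks**: automatically filter inappropriate or low-quality content
#### 2.2.2 Smart Caching System
- **Semantic cache**: similar example sentences share the same image
#### 2.2.3 Image Generation Engine Integration
- **Flow orchestration**: coordinates the two-stage generation flow
- **Error handling**: retry and fallback strategy when a single stage fails
- **State management**: tracks generation progress and status updates in real time
- **Cost optimization**: smart scheduling and resource management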
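The retry-and-fallback behavior listed under error handling could look roughly like the sketch below. It is not the project's implementation: `StageRetry`, `RunWithRetryAsync`, the backoff values, and the fallback delegate are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not from the PRD): retries one stage, then degrades to a fallback.
public static class StageRetry
{
    public static async Task<T> RunWithRetryAsync<T>(
        Func<Task<T>> stage,      // the Gemini or Replicate call
        Func<Task<T>> fallback,   // degraded path, e.g. a cached result
        int maxRetries = 3,
        int baseDelayMs = 500)
    {
        for (var attempt = 0; attempt <= maxRetries; attempt++)
        {
            try
            {
                return await stage();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                // Exponential backoff: baseDelayMs, 2x, 4x, ...
                await Task.Delay(baseDelayMs * (1 << attempt));
            }
            catch (Exception)
            {
                break; // retries exhausted; fall through to the fallback
            }
        }
        return await fallback();
    }
}
```

With `maxRetries = 3` the stage runs at most four times (one initial attempt plus three retries) before the fallback is used.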
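#### 2.2.4 Smart Caching System
- **Two-stage cache**: Gemini description cache + Replicate image cache
- **Semantic cache**: similar example sentences share the same image description and final image
- **Multi-tier cache strategy**: memory → database → file system
- **Cache invalidation**: cleanup based on usage frequency and age
- **Pre-generation strategy**: pre-generate images for popular words
- **Pre-generation strategy**: pre-generate descriptions and images for popular words
#### 2.2.3 Storage Abstraction Layer
#### 2.2.5 Storage Abstraction Layer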
```csharp
public interface IImageStorageService
{
@@ -78,8 +105,20 @@ CREATE TABLE example_images (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
relative_path VARCHAR(500) NOT NULL, -- relative image path
alt_text VARCHAR(200), -- image alt text
generation_prompt TEXT, -- prompt used for AI generation
generation_provider VARCHAR(50), -- generation service provider
-- Two-stage generation fields
gemini_prompt TEXT, -- original prompt sent to Gemini for description generation
gemini_description TEXT, -- image description generated by Gemini
replicate_prompt TEXT, -- final optimized prompt sent to Replicate
replicate_model VARCHAR(100), -- Replicate model name
replicate_version VARCHAR(100), -- model version
-- Generation cost tracking
gemini_cost DECIMAL(10,6), -- Gemini API cost
replicate_cost DECIMAL(10,6), -- Replicate API cost
total_generation_cost DECIMAL(10,6), -- total generation cost
-- Existing fields
file_size INTEGER, -- file size (bytes)
image_width INTEGER, -- image width
image_height INTEGER, -- image height
@@ -112,13 +151,39 @@ CREATE TABLE image_generation_requests (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES user_profiles(id) ON DELETE CASCADE,
flashcard_id UUID REFERENCES flashcards(id) ON DELETE CASCADE,
request_prompt TEXT NOT NULL, -- prompt for the generation request
generation_status VARCHAR(20) DEFAULT 'pending', -- status: pending/processing/completed/failed
-- Two-stage status tracking
overall_status VARCHAR(30) DEFAULT 'pending', -- overall: pending/description_generating/image_generating/completed/failed
gemini_status VARCHAR(20) DEFAULT 'pending', -- Gemini stage: pending/processing/completed/failed
replicate_status VARCHAR(20) DEFAULT 'pending', -- Replicate stage: pending/processing/completed/failed
-- Request content
original_request TEXT NOT NULL, -- original request content (flashcard info)
gemini_prompt TEXT, -- prompt sent to Gemini
generated_description TEXT, -- image description generated by Gemini
final_replicate_prompt TEXT, -- final prompt sent to Replicate
-- Results and errors
generated_image_id UUID REFERENCES example_images(id),
error_message TEXT, -- error message
cost_credits DECIMAL(10,4), -- credits consumed
processing_time_ms INTEGER, -- processing time (ms)
gemini_error_message TEXT, -- Gemini stage error message
replicate_error_message TEXT, -- Replicate stage error message
-- Performance tracking
gemini_processing_time_ms INTEGER, -- Gemini processing time
replicate_processing_time_ms INTEGER, -- Replicate processing time
total_processing_time_ms INTEGER, -- total processing time
-- Cost tracking
gemini_cost DECIMAL(10,6), -- Gemini API cost
replicate_cost DECIMAL(10,6), -- Replicate API cost
total_cost DECIMAL(10,6), -- total cost
-- Timestamps
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
gemini_started_at TIMESTAMP,
gemini_completed_at TIMESTAMP,
replicate_started_at TIMESTAMP,
replicate_completed_at TIMESTAMP,
completed_at TIMESTAMP
);
```
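Because `overall_status` is stored next to `gemini_status` and `replicate_status`, the three columns can drift apart unless one rule derives the overall value. A minimal sketch of such a rule follows; `GenerationStatus.Overall` is a hypothetical helper, and treating "Gemini completed, Replicate not yet started" as `image_generating` is an assumption rather than something the schema states.

```csharp
// Hypothetical helper (not from the PRD): derives overall_status from the two stage columns.
public static class GenerationStatus
{
    public static string Overall(string geminiStatus, string replicateStatus)
    {
        // Any failed stage fails the whole request.
        if (geminiStatus == "failed" || replicateStatus == "failed") return "failed";
        if (geminiStatus == "completed" && replicateStatus == "completed") return "completed";
        // Assumption: once the description exists, the request counts as being in stage two.
        if (geminiStatus == "completed") return "image_generating";
        if (geminiStatus == "processing") return "description_generating";
        return "pending";
    }
}
```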
@@ -136,7 +201,7 @@ CREATE INDEX idx_generation_requests_status ON image_generation_requests(generat
### 4.1 Core API Endpoints
#### 4.1.1 Generate Example-Sentence Image
#### 4.1.1 Generate Example-Sentence Image (Two-Stage Flow)
```http
POST /api/flashcards/{flashcardId}/generate-example-image
Authorization: Bearer {token}
@ -144,17 +209,22 @@ Content-Type: application/json
Request Body:
{
"prompt": "A business meeting scene where someone is bringing up a topic",
"style": "cartoon|realistic|minimal", // image style
"style": "cartoon|realistic|minimal", // image style preference
"priority": "normal|high|low", // generation priority
"dimensions": {
"dimensions": { // requested image size
"width": 512,
"height": 512
},
"generationOptions": { // generation options
"useGeminiCache": true, // whether to use the Gemini description cache
"useImageCache": true, // whether to use the image cache
"maxRetries": 3 // maximum retry count
},
"replicateModel": "flux-1-dev|stable-diffusion-xl", // Replicate model selection
"additionalContext": { // additional context
"difficultyLevel": "B1",
"scenario": "business",
"learnerPreferences": ["visual", "colorful"]
"learnerLevel": "B1",
"scenario": "business|daily|academic",
"visualPreferences": ["colorful", "simple", "realistic"]
}
}
@@ -163,34 +233,111 @@ Response:
"success": true,
"data": {
"requestId": "uuid",
"status": "processing|completed",
"estimatedTimeMinutes": 2,
"imageUrl": "https://cdn.dramaling.com/images/examples/uuid.png", // present only once completed
"costCredits": 1.5,
"overallStatus": "pending", // pending/description_generating/image_generating/completed/failed
"currentStage": "description_generation", // stage currently running
"estimatedTimeMinutes": {
"gemini": 0.5, // estimated Gemini description time
"replicate": 2, // estimated Replicate image time
"total": 2.5
},
"costEstimate": {
"gemini": 0.001, // estimated Gemini cost
"replicate": 0.05, // estimated Replicate cost
"total": 0.051
},
"queuePosition": 3 // position in the queue
}
}
```
#### 4.1.2 Get Image Generation Status
#### 4.1.2 Get Image Generation Status (Two-Stage Status Tracking)
```http
GET /api/image-generation/requests/{requestId}/status
Authorization: Bearer {token}
Response:
Response (in progress):
{
"success": true,
"data": {
"status": "completed",
"imageUrl": "https://cdn.dramaling.com/images/examples/uuid.png",
"qualityScore": 0.95,
"alternativeImages": [ // other options generated at the same time
{
"imageUrl": "https://cdn.dramaling.com/images/examples/uuid-alt1.png",
"qualityScore": 0.87
"requestId": "uuid",
"overallStatus": "image_generating",
"stages": {
"gemini": {
"status": "completed",
"startedAt": "2025-09-24T10:28:00Z",
"completedAt": "2025-09-24T10:28:15Z",
"processingTimeMs": 15000,
"cost": 0.0012,
"generatedDescription": "A professional business meeting scene with diverse people sitting around a modern conference table. One person is gesturing while presenting an idea, with other colleagues listening attentively..."
},
"replicate": {
"status": "processing",
"startedAt": "2025-09-24T10:28:20Z",
"model": "flux-1-dev",
"estimatedCompletionTime": "2025-09-24T10:30:30Z",
"progress": "65%"
}
],
"completedAt": "2025-09-24T10:30:00Z"
},
"totalProcessingTimeMs": 95000,
"estimatedRemainingTimeMs": 45000
}
}
Response (completed):
{
"success": true,
"data": {
"requestId": "uuid",
"overallStatus": "completed",
"stages": {
"gemini": {
"status": "completed",
"processingTimeMs": 15000,
"cost": 0.0012,
"generatedDescription": "A professional business meeting scene..."
},
"replicate": {
"status": "completed",
"processingTimeMs": 125000,
"cost": 0.048,
"model": "flux-1-dev",
"modelVersion": "dev-v1.2"
}
},
"result": {
"imageUrl": "https://cdn.dramaling.com/images/examples/uuid.png",
"imageId": "uuid",
"qualityScore": 0.95,
"dimensions": { "width": 512, "height": 512 },
"fileSize": 245760
},
"totalCost": 0.0492,
"totalProcessingTimeMs": 140000,
"completedAt": "2025-09-24T10:30:25Z"
}
}
Response (failed):
{
"success": true,
"data": {
"requestId": "uuid",
"overallStatus": "failed",
"failedStage": "replicate",
"stages": {
"gemini": {
"status": "completed",
"cost": 0.0012
},
"replicate": {
"status": "failed",
"error": "Content policy violation: Generated image contains inappropriate content",
"retryCount": 3
}
},
"totalCost": 0.0012,
"canRetry": true,
"suggestedAction": "modify_prompt"
}
}
```
@@ -301,32 +448,271 @@ Response:
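## 6. Technical Implementation Details
### 6.1 AI Image Generation Integration
### 6.1 Two-Stage AI Image Generation Integration
#### 6.1.1 Prompt Optimization Strategy
#### 6.1.1 Stage 1: Gemini Image Description Generation Service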
```csharp
public class PromptBuilder
public class GeminiImageDescriptionService : IGeminiImageDescriptionService
{
public string BuildPrompt(Flashcard flashcard)
{
var basePrompt = $"Create an educational illustration for the English word '{flashcard.Word}'";
var context = $"Context: {flashcard.Example}";
var style = GetStyleForDifficulty(flashcard.DifficultyLevel);
var constraints = "Style: clean, educational, appropriate for language learning";
private readonly GeminiAIProvider _geminiProvider;
private readonly ILogger<GeminiImageDescriptionService> _logger;
return $"{basePrompt}. {context}. {style}. {constraints}";
public async Task<ImageDescriptionResult> GenerateDescriptionAsync(Flashcard flashcard, GenerationOptions options)
{
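var prompt = BuildGeminiPrompt(flashcard, options);
var stopwatch = System.Diagnostics.Stopwatch.StartNew(); // times the Gemini call for ProcessingTimeMs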
try
{
var geminiResponse = await _geminiProvider.CallGeminiAPIAsync(prompt);
var description = ExtractImageDescription(geminiResponse);
return new ImageDescriptionResult
{
Success = true,
Description = description,
OptimizedPrompt = OptimizeForReplicate(description, options),
Cost = CalculateGeminiCost(prompt),
ProcessingTimeMs = stopwatch.ElapsedMilliseconds
};
}
catch (Exception ex)
{
_logger.LogError(ex, "Gemini description generation failed for flashcard {FlashcardId}", flashcard.Id);
return new ImageDescriptionResult { Success = false, Error = ex.Message };
}
}
private string GetStyleForDifficulty(string level)
private string BuildGeminiPrompt(Flashcard flashcard, GenerationOptions options)
{
return level switch
return $@"You are a visual content creator for English language learning. Generate a detailed image description prompt for the word ""{flashcard.Word}"".
**Context Information:**
- Word: {flashcard.Word}
- Translation: {flashcard.Translation}
- Example: {flashcard.Example}
- Part of Speech: {flashcard.PartOfSpeech}
- Difficulty Level: {flashcard.DifficultyLevel}
- Learner Preference Style: {options.Style}
**Requirements:**
1. Create a vivid, educational scene that clearly illustrates the word's meaning
2. Include contextual elements that help with vocabulary memorization
3. Style should be {GetStyleForDifficulty(flashcard.DifficultyLevel)}
4. Ensure the scene is culturally appropriate and educational
5. Focus on visual clarity and learning effectiveness
**Return Format:**
Return ONLY the image description prompt, no additional text or explanation.
**Example Output:**
""A professional business meeting scene with 5-6 diverse people sitting around a modern conference table. One person is standing and gesturing while presenting an idea to colleagues. The scene shows engaged listeners with some taking notes. Modern office setting with large windows showing city view. Clean, professional illustration style with good lighting and clear details.""";
}
private string OptimizeForReplicate(string description, GenerationOptions options)
{
// Optimize the description for Replicate models
var optimizedPrompt = description;
// Append style-enhancing keywords
switch (options.Style?.ToLower())
{
"A1" or "A2" => "Simple cartoon style with bright colors",
"B1" or "B2" => "Semi-realistic style with clear details",
"C1" or "C2" => "Realistic style with nuanced visual elements",
_ => "Balanced educational illustration"
case "cartoon":
optimizedPrompt += ", cartoon style, bright colors, clean lines, educational illustration";
break;
case "realistic":
optimizedPrompt += ", photorealistic, high quality, detailed, professional photography style";
break;
case "minimal":
optimizedPrompt += ", minimalist style, clean design, simple composition, clear focus";
break;
}
// Append quality-enhancing keywords
optimizedPrompt += ", high quality, well-composed, educational, appropriate for language learning";
return optimizedPrompt;
}
}
```
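`BuildGeminiPrompt` calls `GetStyleForDifficulty`, but in this hunk the body of that method appears only among the lines of the earlier `PromptBuilder`. If the mapping is meant to carry over unchanged (an assumption), it could be kept as a small standalone helper. The class name `DifficultyStyle` is hypothetical; the four strings are the ones shown in the hunk.

```csharp
// Hypothetical home for the CEFR-to-style mapping; the strings come from the earlier PromptBuilder.
public static class DifficultyStyle
{
    public static string GetStyleForDifficulty(string level) => level switch
    {
        "A1" or "A2" => "Simple cartoon style with bright colors",
        "B1" or "B2" => "Semi-realistic style with clear details",
        "C1" or "C2" => "Realistic style with nuanced visual elements",
        _ => "Balanced educational illustration" // unknown or missing level
    };
}
```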
#### 6.1.2 Stage 2: Replicate Image Generation Service
```csharp
public class ReplicateImageGenerationService : IReplicateImageGenerationService
{
private readonly HttpClient _httpClient;
private readonly ReplicateOptions _options;
private readonly ILogger<ReplicateImageGenerationService> _logger;
public async Task<ImageGenerationResult> GenerateImageAsync(string prompt, ReplicateModel model, GenerationOptions options)
{
var requestPayload = BuildReplicateRequest(prompt, model, options);
try
{
// Start the Replicate prediction
var predictionResponse = await StartPredictionAsync(requestPayload);
// Poll until generation finishes
var result = await WaitForCompletionAsync(predictionResponse.Id, options.TimeoutMinutes);
return result;
}
catch (Exception ex)
{
_logger.LogError(ex, "Replicate image generation failed");
return new ImageGenerationResult { Success = false, Error = ex.Message };
}
}
private object BuildReplicateRequest(string prompt, ReplicateModel model, GenerationOptions options)
{
return model.Name switch
{
"flux-1-dev" => new
{
input = new
{
prompt = prompt,
width = options.Width ?? 512,
height = options.Height ?? 512,
num_outputs = 1,
guidance_scale = 3.5,
num_inference_steps = 28,
seed = options.Seed ?? Random.Shared.Next()
}
},
"stable-diffusion-xl" => new
{
input = new
{
prompt = prompt,
width = options.Width ?? 512,
height = options.Height ?? 512,
num_outputs = 1,
scheduler = "K_EULER_ANCESTRAL",
num_inference_steps = 25,
guidance_scale = 7.5,
seed = options.Seed ?? Random.Shared.Next()
}
},
_ => throw new NotSupportedException($"Model {model.Name} not supported")
};
}
private async Task<ImageGenerationResult> WaitForCompletionAsync(string predictionId, int timeoutMinutes)
{
var timeout = TimeSpan.FromMinutes(timeoutMinutes);
var pollInterval = TimeSpan.FromSeconds(2);
var startTime = DateTime.UtcNow;
while (DateTime.UtcNow - startTime < timeout)
{
var status = await GetPredictionStatusAsync(predictionId);
switch (status.Status)
{
case "succeeded":
return new ImageGenerationResult
{
Success = true,
ImageUrl = status.Output?.FirstOrDefault()?.ToString(),
ProcessingTimeMs = (int)(DateTime.UtcNow - startTime).TotalMilliseconds,
Cost = CalculateReplicateCost(status.Metrics),
ModelVersion = status.Version
};
case "failed":
case "canceled":
return new ImageGenerationResult
{
Success = false,
Error = status.Error?.ToString() ?? "Generation failed",
ProcessingTimeMs = (int)(DateTime.UtcNow - startTime).TotalMilliseconds
};
default:
// Non-terminal statuses such as "starting" and "processing": wait, then poll again
await Task.Delay(pollInterval);
break;
}
}
return new ImageGenerationResult
{
Success = false,
Error = "Generation timeout exceeded"
};
}
}
```
#### 6.1.3 Two-Stage Flow Orchestration Service
```csharp
public class ImageGenerationOrchestrator : IImageGenerationOrchestrator
{
private readonly IGeminiImageDescriptionService _geminiService;
private readonly IReplicateImageGenerationService _replicateService;
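private readonly IImageGenerationRepository _repository;
private readonly ILogger<ImageGenerationOrchestrator> _logger; // used by the pipeline's catch block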
public async Task<GenerationRequestResult> StartGenerationAsync(Guid flashcardId, GenerationRequest request)
{
var generationRequest = await _repository.CreateRequestAsync(flashcardId, request);
// Run the two-stage generation in the background
_ = Task.Run(() => ExecuteGenerationPipelineAsync(generationRequest));
return new GenerationRequestResult
{
RequestId = generationRequest.Id,
Status = "pending",
EstimatedTimeMinutes = 3
};
}
private async Task ExecuteGenerationPipelineAsync(ImageGenerationRequest request)
{
try
{
// Stage 1: generate the image description
await _repository.UpdateStatusAsync(request.Id, "description_generating");
var descriptionResult = await _geminiService.GenerateDescriptionAsync(
request.Flashcard,
request.Options
);
if (!descriptionResult.Success)
{
await _repository.MarkAsFailedAsync(request.Id, "gemini", descriptionResult.Error);
return;
}
await _repository.UpdateGeminiResultAsync(request.Id, descriptionResult);
// Stage 2: generate the image
await _repository.UpdateStatusAsync(request.Id, "image_generating");
var imageResult = await _replicateService.GenerateImageAsync(
descriptionResult.OptimizedPrompt,
request.Options.ReplicateModel,
request.Options
);
if (!imageResult.Success)
{
await _repository.MarkAsFailedAsync(request.Id, "replicate", imageResult.Error);
return;
}
// Save the final result
var savedImage = await SaveGeneratedImageAsync(imageResult);
await _repository.CompleteRequestAsync(request.Id, savedImage.Id);
}
catch (Exception ex)
{
_logger.LogError(ex, "Generation pipeline failed for request {RequestId}", request.Id);
await _repository.MarkAsFailedAsync(request.Id, "system", ex.Message);
}
}
}
```
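`StartGenerationAsync` hands the pipeline to a fire-and-forget `Task.Run`, which gives no backpressure and loses in-flight work if the process stops. The batch processing queue mentioned in section 2.2.2 could instead be sketched with a bounded `System.Threading.Channels` channel, as below. `GenerationQueue`, its capacity of 100, and the `pipeline` delegate are hypothetical; a production system might well use a durable queue instead.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-process queue (not from the PRD); the bounded channel provides backpressure.
public sealed class GenerationQueue
{
    private readonly Channel<Guid> _channel =
        Channel.CreateBounded<Guid>(new BoundedChannelOptions(100)
        {
            FullMode = BoundedChannelFullMode.Wait // writers wait when the queue is full
        });

    // Called by the API layer after the request row has been created.
    public ValueTask EnqueueAsync(Guid requestId, CancellationToken ct = default) =>
        _channel.Writer.WriteAsync(requestId, ct);

    // Signals that no more requests will be enqueued.
    public void Complete() => _channel.Writer.Complete();

    // Worker loop: runs the two-stage pipeline for one request at a time.
    public async Task RunWorkerAsync(Func<Guid, Task> pipeline, CancellationToken ct = default)
    {
        await foreach (var requestId in _channel.Reader.ReadAllAsync(ct))
        {
            await pipeline(requestId);
        }
    }
}
```

In this arrangement the orchestrator would enqueue `generationRequest.Id` and return immediately, while a hosted background worker calls `RunWorkerAsync` with the pipeline method.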
@@ -405,34 +791,139 @@ public class ImageStorageFactory
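## 7. Cost Control and Optimization Strategy
### 7.1 Smart Cost Management
### 7.1 Two-Stage Cost Structure Management
#### 7.1.1 Credit System Design
- **New users**: 10 free image generations
- **Basic users**: 50 per month
- **Premium users**: 200 per month
- **Enterprise users**: unlimited
#### 7.1.2 Generation Optimization Strategy
- **Semantic deduplication**: detect similar example sentences and share image resources
- **Batch processing**: merge similar requests to reduce API call cost
- **Off-peak generation**: prioritize processing during off-peak hours to reduce service cost
- **Quality pre-screening**: optimize prompts to raise the first-attempt success rate
### 7.2 Cache Strategy Optimization
#### 7.2.1 Multi-Tier Cache Architecture
#### 7.1.1 Detailed Cost Analysis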
```
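L1: In-memory cache (1 hour) → hottest images
L2: Redis cache (24 hours) → frequently used images
L3: Database cache (permanent) → all generated images
L4: CDN cache (30 days) → publicly accessed images
Cost structure of one complete generation:
├── Gemini description generation: $0.001 - $0.003
│   ├── Based on input token count (~500-1000 tokens)
│   ├── Output token count (~200-400 tokens)
│   └── Gemini 1.5 Flash pricing
└── Replicate image generation: $0.04 - $0.08
    ├── FLUX-1-dev: ~$0.05/image
    ├── Stable Diffusion XL: ~$0.04/image
    └── Based on generation time and compute resources
Total cost range: $0.041 - $0.083 per image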
```
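#### 7.2.2 Pre-Generation Strategy
- **Popular words**: pre-generate images for high-frequency words based on learning statistics
- **New-word prediction**: AI predicts new words likely to become popular
- **Seasonal content**: prepare holiday and current-events vocabulary in advance
#### 7.1.2 Credit System Redesign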
```
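Credit consumption strategy (based on actual cost):
├── Gemini stage: 0.1 credits (about $0.002)
├── Replicate stage:
│   ├── FLUX-1-dev: 5 credits (about $0.05)
│   ├── Stable Diffusion XL: 4 credits (about $0.04)
│   └── No Replicate credits charged on failure
└── Total cost: 4.1 - 5.1 credits per image
User tier credit allocation:
├── New users: 50 credits (about 10 images)
├── Basic users: 250 credits/month (about 50 images)
├── Premium users: 1000 credits/month (about 200 images)
└── Enterprise users: unlimited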
```
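The credit figures above reduce to a short calculation. The sketch below mirrors them; `CreditPricing` and its members are hypothetical names, and the model keys reuse the `replicateModel` values from the API section.

```csharp
using System;

// Hypothetical pricing table (not from the PRD) mirroring the credit figures above.
public static class CreditPricing
{
    public const decimal GeminiCredits = 0.1m;

    public static decimal ReplicateCredits(string model) => model switch
    {
        "flux-1-dev" => 5m,
        "stable-diffusion-xl" => 4m,
        _ => throw new NotSupportedException($"Model {model} not supported")
    };

    // Credits for one complete two-stage generation.
    public static decimal TotalCredits(string model) =>
        GeminiCredits + ReplicateCredits(model);

    // Whole images that a credit allowance covers with a given model.
    public static int ImagesCovered(decimal allowance, string model) =>
        (int)Math.Floor(allowance / TotalCredits(model));
}
```

With FLUX-1-dev at 5.1 credits per image, the 50, 250, and 1000 credit tiers cover 9, 49, and 196 whole images; with Stable Diffusion XL at 4.1 credits they cover 12, 60, and 243. The "about 10 / 50 / 200 images" figures are therefore slightly optimistic for FLUX-1-dev and conservative for Stable Diffusion XL.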
#### 7.1.3 Smart Cost Optimization Strategy
**1. Per-Stage Cost Control**
```csharp
public class CostOptimizationStrategy
{
public async Task<bool> ShouldProceedToReplicate(DescriptionResult result, UserQuota quota)
{
// Check whether the user's remaining credits cover the Replicate stage
var replicateCost = CalculateReplicateCost(result.Options);
if (quota.RemainingCredits < replicateCost)
{
// The Gemini stage is already complete; save the description for later use
await SaveDescriptionForLater(result);
return false;
}
return true;
}
}
```
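**2. Semantic Deduplication and Sharing**
- **Gemini description cache**: similar flashcards share description results
- **Final image sharing**: identical optimized prompts reuse the generated result
- **Partial reuse strategy**: when description similarity is ≥ 85%, prompt the user to choose whether to reuse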
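The PRD does not say how description similarity is measured. One common choice, assumed here, is cosine similarity between embedding vectors of the two descriptions, compared against the 85% threshold. `SemanticMatch` and both of its methods are hypothetical, and producing the embeddings is out of scope for this sketch.

```csharp
using System;

// Hypothetical similarity check (not from the PRD); cosine over embeddings is an assumption.
public static class SemanticMatch
{
    public static double Cosine(double[] a, double[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Vector lengths differ");
        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0; // zero vector: treat as no match
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // True when the two descriptions are similar enough to offer reuse.
    public static bool IsReusable(double[] a, double[] b, double threshold = 0.85) =>
        Cosine(a, b) >= threshold;
}
```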
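**3. Batching and Pre-Generation**
- **Batched Gemini calls**: handle multiple flashcard descriptions in a single request
- **Pre-generate popular words**: pre-generate high-frequency words based on learning statistics
- **Off-peak generation**: process non-urgent requests first during lower-cost periods
### 7.2 Two-Stage Cache Strategy Optimization
#### 7.2.1 Dual-Cache Architecture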
```
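Gemini description cache tiers:
├── L1: In-memory cache (30 minutes) → recently generated descriptions
├── L2: Redis cache (24 hours) → descriptions for frequently used words
└── L3: Database cache (permanent) → all generated descriptions
Final image cache tiers:
├── L1: In-memory cache (1 hour) → hottest image URLs
├── L2: Redis cache (24 hours) → metadata for frequently used images
├── L3: Database cache (permanent) → records of all generated images
└── L4: CDN/storage cache (30 days) → actual image files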
```
#### 7.2.2 Smart Cache Matching Strategy
```csharp
public class TwoStageCache
{
// Gemini description cache matching
public async Task<string> GetCachedDescriptionAsync(Flashcard flashcard, GenerationOptions options)
{
// 1. Exact match: same flashcard + options
var exactMatch = await _cache.GetAsync($"desc:{flashcard.Id}:{options.GetHashCode()}");
if (exactMatch != null) return exactMatch;
// 2. Semantic match: similar example sentence and context
var semanticMatches = await FindSemanticMatches(flashcard.Example, 0.85);
if (semanticMatches.Any())
{
return await SelectBestMatch(semanticMatches, options);
}
// 3. Basic match: same word, different example sentence
var wordMatches = await _cache.GetAsync($"desc:word:{flashcard.Word}");
return wordMatches;
}
// Replicate image cache matching
public async Task<string> GetCachedImageAsync(string optimizedPrompt)
{
// 1. Exact match: identical optimized prompt
var promptHash = ComputeHash(optimizedPrompt);
var exactImage = await _cache.GetAsync($"img:{promptHash}");
if (exactImage != null) return exactImage;
// 2. Similar match: similar prompts (similarity ≥ 90%)
var similarPrompts = await FindSimilarPrompts(optimizedPrompt, 0.9);
if (similarPrompts.Any())
{
return await SelectBestImageMatch(similarPrompts);
}
return null;
}
}
```
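`TwoStageCache` builds one key from `options.GetHashCode()` and another from an unspecified `ComputeHash`. Unless `GenerationOptions` overrides `GetHashCode`, the default is identity-based rather than content-based, and string hash codes in modern .NET are randomized per process, so neither is suitable as a key in Redis or the database. A content-based alternative might look like the sketch below. It assumes .NET 5 or later; `CacheKeys` and the choice of `style` and `model` as the option fields that matter are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical key builder (not from the PRD): content-based and stable across processes.
public static class CacheKeys
{
    public static string ComputeHash(string text)
    {
        // Normalize so that trivial case and whitespace differences share one key.
        var normalized = text.Trim().ToLowerInvariant();
        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(bytes).ToLowerInvariant();
    }

    // Key for a cached Gemini description; style and model stand in for the relevant options.
    public static string DescriptionKey(Guid flashcardId, string style, string model) =>
        $"desc:{flashcardId}:{ComputeHash($"{style}|{model}")}";

    // Key for a cached image, derived from the optimized prompt.
    public static string ImageKey(string optimizedPrompt) =>
        $"img:{ComputeHash(optimizedPrompt)}";
}
```

Whether to normalize case before hashing is a design choice: it raises the hit rate but treats prompts that differ only in capitalization as identical.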
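#### 7.2.3 Pre-Generation and Pre-Caching Strategy
- **Popular-word pre-generation**: complete both stages ahead of time based on learning statistics
- **Description pre-generation**: pre-generate Gemini descriptions for new words; generate images on demand
- **Seasonal content**: prepare descriptions and images for holiday and current-events vocabulary in advance
- **Learning-path prediction**: pre-generate images for upcoming words based on the user's learning progress
## 8. Monitoring and Analytics Metrics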
@@ -472,54 +963,80 @@ alerts:
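## 9. Development Milestones and Schedule
### 9.1 Phase 1: Core Feature Development (4-6 weeks)
### 9.1 Phase 1: Two-Stage Core Feature Development (5-7 weeks)
#### Week 1-2: Foundation
- [ ] Design and create the database schema
#### Week 1-2: Two-Stage Architecture Foundation
- [ ] Extend the database schema (two-stage tracking support)
- [ ] Implement `GeminiImageDescriptionService`
- [ ] Develop `ReplicateImageGenerationService`
- [ ] Build the `ImageGenerationOrchestrator` flow orchestration
- [ ] Implement the storage abstraction layer (local + cloud)
- [ ] Develop basic API endpoints
- [ ] Integrate the AI image generation service
#### Week 3-4: Frontend Integration
- [ ] Add image generation to the flashcard page
- [ ] Implement loading states and animations
- [ ] Error handling and user feedback
#### Week 3-4: API and Backend Services
- [ ] Develop the two-stage generation API endpoints
- [ ] Status tracking and progress reporting
- [ ] Error handling and retry strategy
- [ ] Replicate API integration and polling
- [ ] Description generation built on the existing Gemini service
#### Week 5-6: Frontend Integration and User Experience
- [ ] Add two-stage generation to the flashcard page
- [ ] Per-stage loading states and progress display
- [ ] Real-time status updates (WebSocket/long polling)
- [ ] Two-stage error handling and user feedback
- [ ] Responsive design adaptation
#### Week 5-6: Optimization and Testing
- [ ] Implement caching
#### Week 7: Caching and Optimization
- [ ] Implement two-stage caching
- [ ] Semantic matching of Gemini descriptions
- [ ] Replicate image deduplication
- [ ] Develop the batch processing queue
- [ ] Unit and integration tests
- [ ] Performance testing and optimization
- [ ] Implement the cost-control strategy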
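### 9.2 Phase 2: Advanced Features (3-4 weeks)
### 9.2 Phase 2: Advanced Features and Cost Optimization (3-4 weeks)
#### Week 7-8: Smart Optimization
- [ ] Prompt optimization engine
- [ ] Automatic image quality checks
- [ ] Semantic deduplication
- [ ] Implement the pre-generation strategy
#### Week 8-9: Smart Optimization and Cost Control
- [ ] Per-stage credit deduction system
- [ ] Smart prompt optimization engine (Gemini→Replicate)
- [ ] Similarity detection and cache sharing
- [ ] Pre-generation strategy (descriptions for popular words)
- [ ] Automatic image quality scoring
#### Week 9-10: Admin Features
- [ ] Image review in the admin console
- [ ] Cost statistics and reports
#### Week 10-11: Admin Features and Monitoring
- [ ] Two-stage cost statistics and reports
- [ ] Admin console (Gemini description review + image review)
- [ ] User credit system integration
- [ ] Monitoring and alerting system
- [ ] Per-stage monitoring and alerts (Gemini failure rate, Replicate timeouts)
- [ ] Performance analytics dashboard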
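### 9.3 Phase 3: Launch and Optimization (2-3 weeks)
### 9.3 Phase 3: Production Deployment and Expansion (2-3 weeks)
#### Week 11-12: Production Deployment
- [ ] Configure cloud storage
- [ ] CDN setup and optimization
#### Week 12-13: Production Environment Deployment
- [ ] Configure the Replicate API for production
- [ ] Configure cloud storage and CDN
- [ ] Fault tolerance and fallback for two-stage generation
- [ ] Production deployment testing
- [ ] Canary release and A/B testing
- [ ] Canary release (open description generation first, then image generation)
#### Week 13: Follow-Up Optimization
- [ ] Collect and analyze user feedback
- [ ] Performance tuning and cost optimization
- [ ] Plan new features
#### Week 14: Optimization and Expansion
- [ ] Collect user feedback and analyze two-stage results
- [ ] Cost-benefit analysis and credit system tuning
- [ ] Expand multi-model support (more Replicate models)
- [ ] Complete documentation and team training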
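### 9.4 Schedule Adjustments for Technical Risk
#### Buffer Time for High-Risk Items
- **Replicate API integration complexity**: +1 week
- **Two-stage state synchronization**: +0.5 week
- **Cost-control strategy implementation**: +0.5 week
- **Cache matching algorithm optimization**: +1 week
#### Total Estimated Schedule: **10-14 weeks**
- **Optimistic**: 10 weeks (no major technical obstacles)
- **Realistic**: 12 weeks (includes handling common issues)
- **Conservative**: 14 weeks (includes risk buffer)
## 10. Risk Assessment and Mitigation
### 10.1 Technical Risks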