# Option Vocabulary Bank Functional Specification
**Version**: 1.0
**Date**: 2025-09-29
**Project**: DramaLing Intelligent English Learning System
**Module**: Quiz Option Generation System

---
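## 📋 Feature Overview
### Background
Quiz option generation in the DramaLing system currently has the following problems:
- **The frontend uses simple placeholders**: `["其他選項1", "其他選項2", "其他選項3"]` (literally "other option 1", "other option 2", "other option 3")
- **The backend picks at random**: distractors are drawn randomly from the user's own flashcards, with no matching logic
- **Option quality is inconsistent**: distractors may turn out far too easy or far too hard
- **No pedagogical grounding**: cognitive load theory in language learning is not taken into account
### Goal
Build an **intelligent option vocabulary bank** that automatically generates high-quality quiz distractors based on the characteristics of the target word.
### Core Features
- **Three-parameter matching**: CEFR level, word length, part of speech
- **Smart filtering**: avoids unsuitable options such as synonyms and near-identical spellings
- **Extensibility**: supports ongoing vocabulary additions and algorithm tuning
- **Performance**: indexes and caching keep responses fast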
---
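## 🎯 Functional Requirements
### Core Requirements
| Requirement ID | Description | Priority |
|--------|------|-------|
| REQ-001 | Match words of similar difficulty by CEFR level | High |
| REQ-002 | Match words of similar length by character count | High |
| REQ-003 | Match words with the same part of speech | High |
| REQ-004 | Generate 3 distinct distractors per question | High |
| REQ-005 | Support multiple quiz types (vocabulary choice, listening, etc.) | Medium |
| REQ-006 | Provide a vocabulary bank management interface | Low |
> **Design simplification note**: To reduce maintenance cost and implementation complexity, advanced features such as synonym exclusion, quality scoring, and frequency rating have been removed. The design focuses on the core three-parameter matching so that the system stays simple and practical.
### Non-Functional Requirements
| Requirement ID | Description | Target |
|--------|------|-------|
| NFR-001 | Response time | < 100ms |
| NFR-002 | Vocabulary bank size | 10,000 words initially |
| NFR-003 | Availability | 99.9% |
| NFR-004 | Scalability | Supports 100,000+ words |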
---
## 🏗️ System Design
### Overall Architecture
```
┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│ Frontend quiz page  │────│Option generation API│────│ Vocabulary service  │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘
                                      │                          │
                                      ▼                          ▼
                           ┌─────────────────────┐    ┌─────────────────────┐
                           │     Cache layer     │    │  Option vocab bank  │
                           │   (Redis/Memory)    │    │     (Database)      │
                           └─────────────────────┘    └─────────────────────┘
```
### Core Components
1. **OptionsVocabulary entity** - data model for the vocabulary bank
2. **OptionsVocabularyService** - vocabulary bank business logic
3. **DistractorGenerationService** - distractor generation logic
4. **VocabularyMatchingEngine** - vocabulary matching algorithm
---
## 📊 Data Model Design
### OptionsVocabulary Entity
```csharp
using System.ComponentModel.DataAnnotations;
using Microsoft.EntityFrameworkCore;

namespace DramaLing.Api.Models.Entities;

// EF Core's [Index] attribute is declared on the class, not on individual properties.
[Index(nameof(Word), IsUnique = true, Name = "IX_OptionsVocabulary_Word")]
[Index(nameof(CEFRLevel), Name = "IX_OptionsVocabulary_CEFR")]
[Index(nameof(PartOfSpeech), Name = "IX_OptionsVocabulary_PartOfSpeech")]
[Index(nameof(WordLength), Name = "IX_OptionsVocabulary_WordLength")]
public class OptionsVocabulary
{
    /// <summary>
    /// Primary key
    /// </summary>
    public Guid Id { get; set; }

    /// <summary>
    /// The word itself
    /// </summary>
    [Required]
    [MaxLength(100)]
    public string Word { get; set; } = string.Empty;

    /// <summary>
    /// CEFR difficulty level (A1, A2, B1, B2, C1, C2)
    /// </summary>
    [Required]
    [MaxLength(2)]
    public string CEFRLevel { get; set; } = string.Empty;

    /// <summary>
    /// Part of speech (noun, verb, adjective, adverb, pronoun, preposition, conjunction, interjection, idiom)
    /// </summary>
    [Required]
    [MaxLength(20)]
    [RegularExpression("^(noun|verb|adjective|adverb|pronoun|preposition|conjunction|interjection|idiom)$",
        ErrorMessage = "PartOfSpeech must be a valid value")]
    public string PartOfSpeech { get; set; } = string.Empty;

    /// <summary>
    /// Word length in characters, computed automatically from Word
    /// </summary>
    public int WordLength { get; set; }

    /// <summary>
    /// Whether the entry is active
    /// </summary>
    public bool IsActive { get; set; } = true;

    /// <summary>
    /// Creation time
    /// </summary>
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;

    /// <summary>
    /// Last update time
    /// </summary>
    public DateTime UpdatedAt { get; set; } = DateTime.UtcNow;
}
```
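The `WordLength` column is described as computed automatically from `Word`, yet the entity exposes it as an ordinary settable property. One possible way to keep the two in sync is to derive the length whenever the word is assigned. The sketch below uses a simplified stand-in class (`VocabularyEntry` is a hypothetical name, not part of this spec) and assumes that "length" means `string.Length` after trimming, spaces included.

```csharp
using System;

// Hypothetical, simplified stand-in for OptionsVocabulary (no EF Core attributes).
public class VocabularyEntry
{
    private string _word = string.Empty;

    public string Word
    {
        get => _word;
        set
        {
            // Trim once so stray whitespace never inflates the stored length.
            _word = (value ?? string.Empty).Trim();
            WordLength = _word.Length;
        }
    }

    // Read-only to callers; always derived from Word.
    public int WordLength { get; private set; }
}
```

With this shape, `new VocabularyEntry { Word = " break the ice " }` stores the trimmed phrase and a `WordLength` of 13, so the value cannot drift from the word it describes.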
### Composite Index Design
```csharp
// Configured in the DbContext
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Core query index: CEFR level + part of speech + word length
    modelBuilder.Entity<OptionsVocabulary>()
        .HasIndex(e => new { e.CEFRLevel, e.PartOfSpeech, e.WordLength })
        .HasDatabaseName("IX_OptionsVocabulary_Core_Matching");

    // Active-status index
    modelBuilder.Entity<OptionsVocabulary>()
        .HasIndex(e => e.IsActive)
        .HasDatabaseName("IX_OptionsVocabulary_Active");
}
```
---
## 🔧 Service Layer Design
### IOptionsVocabularyService Interface
```csharp
namespace DramaLing.Api.Services;

public interface IOptionsVocabularyService
{
    /// <summary>
    /// Generates distractors for the target word
    /// </summary>
    Task<List<string>> GenerateDistractorsAsync(
        string targetWord,
        string cefrLevel,
        string partOfSpeech,
        int count = 3);

    /// <summary>
    /// Adds a word to the option bank
    /// </summary>
    Task<bool> AddVocabularyAsync(OptionsVocabulary vocabulary);

    /// <summary>
    /// Imports words in bulk
    /// </summary>
    Task<int> BulkImportAsync(List<OptionsVocabulary> vocabularies);

    /// <summary>
    /// Searches the bank by the given criteria
    /// </summary>
    Task<List<OptionsVocabulary>> SearchVocabulariesAsync(
        string? cefrLevel = null,
        string? partOfSpeech = null,
        int? minLength = null,
        int? maxLength = null,
        int limit = 100);
}
```
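`SearchVocabulariesAsync` takes only optional filters. A common way to implement such a signature is to start from the full set and append one `Where` clause per argument that was actually supplied. The sketch below does this over an in-memory sequence with LINQ to Objects; `VocabRow` and `VocabularySearch.Search` are hypothetical names used for illustration, and a real implementation would presumably build the same chain on the EF Core `DbSet`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type carrying only the fields the search filters on.
public record VocabRow(string Word, string CEFRLevel, string PartOfSpeech, int WordLength, bool IsActive = true);

public static class VocabularySearch
{
    // Each filter is applied only when the caller supplied a value.
    public static List<VocabRow> Search(
        IEnumerable<VocabRow> source,
        string? cefrLevel = null,
        string? partOfSpeech = null,
        int? minLength = null,
        int? maxLength = null,
        int limit = 100)
    {
        var query = source.Where(v => v.IsActive);

        if (!string.IsNullOrEmpty(cefrLevel))
            query = query.Where(v => v.CEFRLevel == cefrLevel);
        if (!string.IsNullOrEmpty(partOfSpeech))
            query = query.Where(v => v.PartOfSpeech == partOfSpeech);
        if (minLength.HasValue)
            query = query.Where(v => v.WordLength >= minLength.Value);
        if (maxLength.HasValue)
            query = query.Where(v => v.WordLength <= maxLength.Value);

        // Order before limiting so the same call always returns the same rows.
        return query.OrderBy(v => v.Word, StringComparer.Ordinal).Take(limit).ToList();
    }
}
```

Excluding inactive rows by default is an assumption made here; the interface itself does not say whether inactive entries should be searchable.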
### QuestionGeneratorService Integration
```csharp
public class QuestionGeneratorService : IQuestionGeneratorService
{
    private readonly DramaLingDbContext _context;
    private readonly IOptionsVocabularyService _optionsVocabularyService;
    private readonly ILogger<QuestionGeneratorService> _logger;

    public QuestionGeneratorService(
        DramaLingDbContext context,
        IOptionsVocabularyService optionsVocabularyService,
        ILogger<QuestionGeneratorService> logger)
    {
        _context = context;
        _optionsVocabularyService = optionsVocabularyService;
        _logger = logger;
    }

    /// <summary>
    /// Generates the options for a vocabulary-choice question (using the option vocabulary bank)
    /// </summary>
    private async Task<QuestionData> GenerateVocabChoiceAsync(Flashcard flashcard)
    {
        try
        {
            // Prefer the option vocabulary bank as the source of distractors
            var distractors = await _optionsVocabularyService.GenerateDistractorsAsync(
                flashcard.Word,
                flashcard.DifficultyLevel ?? "B1",
                flashcard.PartOfSpeech ?? "noun");

            // If the bank cannot supply enough options, top up from the user's other flashcards
            if (distractors.Count < 3)
            {
                var fallbackDistractors = await GetFallbackDistractorsAsync(flashcard);
                // Skip words that are already among the options so they stay distinct
                var extras = fallbackDistractors
                    .Where(w => !distractors.Contains(w, StringComparer.OrdinalIgnoreCase))
                    .Take(3 - distractors.Count)
                    .ToList();
                distractors.AddRange(extras);
            }

            var options = new List<string> { flashcard.Word };
            options.AddRange(distractors.Take(3));

            // Shuffle the options
            var shuffledOptions = options.OrderBy(x => Guid.NewGuid()).ToArray();

            return new QuestionData
            {
                QuestionType = "vocab-choice",
                Options = shuffledOptions,
                CorrectAnswer = flashcard.Word
            };
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Failed to generate options from vocabulary database, using fallback for {Word}", flashcard.Word);
            // Fall back entirely to the pre-existing logic (GenerateVocabChoiceWithFallbackAsync is not shown here)
            return await GenerateVocabChoiceWithFallbackAsync(flashcard);
        }
    }

    /// <summary>
    /// Fallback option generation (uses the user's other flashcards)
    /// </summary>
    private async Task<List<string>> GetFallbackDistractorsAsync(Flashcard flashcard)
    {
        return await _context.Flashcards
            .Where(f => f.UserId == flashcard.UserId &&
                        f.Id != flashcard.Id &&
                        !f.IsArchived)
            .OrderBy(x => Guid.NewGuid())
            .Take(3)
            .Select(f => f.Word)
            .ToListAsync();
    }
}
```
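The service shuffles with `OrderBy(x => Guid.NewGuid())`, a common shortcut that sorts on throwaway keys. A Fisher-Yates shuffle is the conventional alternative when an explicit, testable shuffle is preferred. The helper below is only a sketch: `OptionShuffler` is a hypothetical name, and the `Random` instance is passed in so that tests can seed it. A call such as `OptionShuffler.Shuffle(options, Random.Shared)` would stand in for the `OrderBy` line.

```csharp
using System;
using System.Collections.Generic;

public static class OptionShuffler
{
    // Returns a new array holding the items in random order (Fisher-Yates); the input is not modified.
    public static T[] Shuffle<T>(IEnumerable<T> items, Random rng)
    {
        var result = new List<T>(items).ToArray();
        for (int i = result.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (result[i], result[j]) = (result[j], result[i]);
        }
        return result;
    }
}
```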
---
## 🌐 API Design
### Integration into the Existing FlashcardsController
The option vocabulary bank is integrated into the existing `POST /api/flashcards/{id}/question` API endpoint.
```csharp
// The existing FlashcardsController.GenerateQuestion method automatically uses the improved QuestionGeneratorService.
// No additional API endpoint is required.
[HttpPost("{id}/question")]
public async Task<ActionResult> GenerateQuestion(Guid id, [FromBody] QuestionRequest request)
{
    try
    {
        // QuestionGeneratorService uses OptionsVocabularyService internally to produce better options
        var questionData = await _questionGeneratorService.GenerateQuestionAsync(id, request.QuestionType);
        return Ok(new { success = true, data = questionData });
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error generating question for flashcard {FlashcardId}", id);
        return StatusCode(500, new { success = false, error = "Failed to generate question" });
    }
}
```
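### Vocabulary Bank Management API (Optional)
> **Note**: The management API below is optional and mainly lets administrators manage the vocabulary bank in bulk.
> Core option generation is already integrated into the existing quiz API and does not depend on these management endpoints.
```csharp
/// <summary>
/// Vocabulary bank management controller (optional).
/// Implement only if administrators need to manage the bank in bulk.
/// </summary>
[ApiController]
[Route("api/admin/[controller]")]
[Authorize(Roles = "Admin")]
public class OptionsVocabularyController : ControllerBase
{
    private readonly IOptionsVocabularyService _vocabularyService;

    public OptionsVocabularyController(IOptionsVocabularyService vocabularyService)
    {
        _vocabularyService = vocabularyService;
    }

    /// <summary>
    /// Bulk-imports words (administrator only)
    /// </summary>
    [HttpPost("bulk-import")]
    public async Task<ActionResult> BulkImport([FromBody] List<AddVocabularyRequest> requests)
    {
        var vocabularies = requests.Select(r => new OptionsVocabulary
        {
            Word = r.Word,
            CEFRLevel = r.CEFRLevel,
            PartOfSpeech = r.PartOfSpeech,
            WordLength = r.Word.Length
        }).ToList();

        var importedCount = await _vocabularyService.BulkImportAsync(vocabularies);
        return Ok(new { ImportedCount = importedCount });
    }

    /// <summary>
    /// Returns vocabulary bank statistics (administrator only)
    /// </summary>
    [HttpGet("stats")]
    public async Task<ActionResult> GetVocabularyStats()
    {
        // GetVocabularyStatsAsync is not declared on IOptionsVocabularyService in this spec;
        // it would have to be added to the interface before this endpoint compiles.
        var stats = await _vocabularyService.GetVocabularyStatsAsync();
        return Ok(stats);
    }
}
```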
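Because `Word` carries a unique index, a bulk import whose payload repeats a word, or contains a word already in the bank, would fail on save unless such entries are filtered out first. The sketch below shows one way to pre-filter a batch in memory. `ImportFilter.SelectNew` is a hypothetical helper, and treating words as case-insensitive is an assumption about how the bank handles casing.

```csharp
using System;
using System.Collections.Generic;

public static class ImportFilter
{
    // Returns the incoming words that are not blank, not already stored, and not repeated within the batch.
    public static List<string> SelectNew(IEnumerable<string> incoming, IEnumerable<string> existing)
    {
        var seen = new HashSet<string>(existing, StringComparer.OrdinalIgnoreCase);
        var accepted = new List<string>();
        foreach (var raw in incoming)
        {
            var word = raw?.Trim();
            if (string.IsNullOrEmpty(word)) continue;
            // HashSet.Add returns false when the word was already present.
            if (seen.Add(word)) accepted.Add(word);
        }
        return accepted;
    }
}
```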
---
## 📁 DTO Definitions
### QuestionOptionsResponse
```csharp
namespace DramaLing.Api.Models.DTOs;

public class QuestionOptionsResponse
{
    public string QuestionType { get; set; } = string.Empty;
    public string[] Options { get; set; } = Array.Empty<string>();
    public string CorrectAnswer { get; set; } = string.Empty;
    public string TargetWord { get; set; } = string.Empty;
    public string? CEFRLevel { get; set; }
    public string? PartOfSpeech { get; set; }
    public DateTime GeneratedAt { get; set; } = DateTime.UtcNow;
}
```
### AddVocabularyRequest
```csharp
public class AddVocabularyRequest
{
    [Required]
    [MaxLength(100)]
    public string Word { get; set; } = string.Empty;

    [Required]
    [RegularExpression("^(A1|A2|B1|B2|C1|C2)$")]
    public string CEFRLevel { get; set; } = string.Empty;

    [Required]
    [MaxLength(20)]
    [RegularExpression("^(noun|verb|adjective|adverb|pronoun|preposition|conjunction|interjection|idiom)$",
        ErrorMessage = "PartOfSpeech must be a valid value")]
    public string PartOfSpeech { get; set; } = string.Empty;
}
```
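Outside the MVC pipeline, for example in a seeder or a unit test, the same attributes can be checked by hand with `Validator.TryValidateObject` from `System.ComponentModel.DataAnnotations`. The sketch below repeats the request class in compact form so that it is self-contained; `RequestValidation.Validate` is a hypothetical helper, not part of this spec.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;

public class AddVocabularyRequest
{
    [Required, MaxLength(100)]
    public string Word { get; set; } = string.Empty;

    [Required, RegularExpression("^(A1|A2|B1|B2|C1|C2)$")]
    public string CEFRLevel { get; set; } = string.Empty;

    [Required, MaxLength(20)]
    [RegularExpression("^(noun|verb|adjective|adverb|pronoun|preposition|conjunction|interjection|idiom)$",
        ErrorMessage = "PartOfSpeech must be a valid value")]
    public string PartOfSpeech { get; set; } = string.Empty;
}

public static class RequestValidation
{
    // Returns the validation error messages; an empty list means the request is valid.
    public static List<string> Validate(object request)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(request);
        // validateAllProperties: true also evaluates MaxLength and RegularExpression, not only Required.
        Validator.TryValidateObject(request, context, results, validateAllProperties: true);
        return results.Select(r => r.ErrorMessage ?? string.Empty).ToList();
    }
}
```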
---
## 💾 Database Migration
### Migration File
```csharp
using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddOptionsVocabularyTable : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.CreateTable(
            name: "OptionsVocabularies",
            columns: table => new
            {
                Id = table.Column<Guid>(nullable: false),
                Word = table.Column<string>(maxLength: 100, nullable: false),
                CEFRLevel = table.Column<string>(maxLength: 2, nullable: false),
                PartOfSpeech = table.Column<string>(maxLength: 20, nullable: false),
                WordLength = table.Column<int>(nullable: false),
                IsActive = table.Column<bool>(nullable: false, defaultValue: true),
                CreatedAt = table.Column<DateTime>(nullable: false),
                UpdatedAt = table.Column<DateTime>(nullable: false)
            },
            constraints: table =>
            {
                table.PrimaryKey("PK_OptionsVocabularies", x => x.Id);
            });

        // Indexes
        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Word",
            table: "OptionsVocabularies",
            column: "Word",
            unique: true);

        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Core_Matching",
            table: "OptionsVocabularies",
            columns: new[] { "CEFRLevel", "PartOfSpeech", "WordLength" });

        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Active",
            table: "OptionsVocabularies",
            column: "IsActive");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropTable(name: "OptionsVocabularies");
    }
}
```
---
## 🔄 Use Cases
### Case 1: Vocabulary-Choice Question API Flow
```
Frontend request:
POST /api/flashcards/{id}/question
{
  "questionType": "vocab-choice"
}
Backend processing:
1. Look up the flashcard: "beautiful" (B1, adjective, 9 characters)
2. Filter distractor candidates from the option vocabulary bank:
   - CEFR: A2, B1, B2 (adjacent levels)
   - Part of speech: adjective
   - Length: 7-11 characters
3. Pick the distractors: ["wonderful", "excellent", "attractive"]
API response:
{
  "success": true,
  "data": {
    "questionType": "vocab-choice",
    "options": ["beautiful", "wonderful", "excellent", "attractive"],
    "correctAnswer": "beautiful"
  }
}
```
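The filtering in step 2 can be sketched as a pure function over an in-memory candidate list. The window sizes (one CEFR level either side, length within two characters of the target) are read off the example above. `DistractorMatcher`, `Candidate`, and the injected `Random` are hypothetical names for illustration; this is not the spec's `VocabularyMatchingEngine`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical candidate row; length is taken from the word itself.
public record Candidate(string Word, string CEFRLevel, string PartOfSpeech);

public static class DistractorMatcher
{
    private static readonly string[] Levels = { "A1", "A2", "B1", "B2", "C1", "C2" };

    // The target level plus its immediate neighbours, e.g. B1 -> A2, B1, B2.
    public static List<string> AdjacentLevels(string level)
    {
        int i = Array.IndexOf(Levels, level);
        if (i < 0) return new List<string>();
        return Levels.Where((lvl, idx) => Math.Abs(idx - i) <= 1).ToList();
    }

    public static List<string> Pick(
        string targetWord, string cefrLevel, string partOfSpeech,
        IEnumerable<Candidate> bank, Random rng, int count = 3)
    {
        var levels = AdjacentLevels(cefrLevel);
        int min = targetWord.Length - 2, max = targetWord.Length + 2;

        var pool = bank
            .Where(c => levels.Contains(c.CEFRLevel))
            .Where(c => c.PartOfSpeech == partOfSpeech)
            .Where(c => c.Word.Length >= min && c.Word.Length <= max)
            .Where(c => !string.Equals(c.Word, targetWord, StringComparison.OrdinalIgnoreCase))
            .Select(c => c.Word)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();

        // Partial Fisher-Yates: draw up to `count` words without replacement.
        int take = Math.Min(count, pool.Count);
        for (int k = 0; k < take; k++)
        {
            int j = rng.Next(k, pool.Count);
            (pool[k], pool[j]) = (pool[j], pool[k]);
        }
        return pool.Take(take).ToList();
    }
}
```

When fewer than `count` candidates survive the filters, `Pick` returns what it has, which is the situation the fallback mechanism is meant to cover.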
### Case 2: Listening Quiz API Flow
```
Frontend request:
POST /api/flashcards/{id}/question
{
  "questionType": "sentence-listening"
}
Backend processing:
1. Look up the flashcard: "running" (A2, verb, 7 characters)
2. Filter distractor candidates from the option vocabulary bank:
   - CEFR: A1, A2, B1
   - Part of speech: verb
   - Length: 5-9 characters
3. Pick the distractors: ["jumping", "walking", "playing"]
API response:
{
  "success": true,
  "data": {
    "questionType": "sentence-listening",
    "options": ["running", "jumping", "walking", "playing"],
    "correctAnswer": "running"
  }
}
```
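### Case 3: Fallback Mechanism
```
Scenario: the vocabulary bank does not contain enough matching options
Processing flow:
1. Try to fetch distractors from the option vocabulary bank → only 1 found
2. Trigger the fallback: top up the remaining 2 options from the user's other flashcards
3. Ensure that 3 distractors can always be provided
Benefit: keeps the system stable, so it works normally even when the vocabulary bank is incomplete
```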
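The top-up step can be sketched as a small merge function. Skipping the target word and any word already chosen is an assumption added here so that the merged options stay distinct. `DistractorTopUp.Merge` is a hypothetical helper, and it can still return fewer than three words when the user has too few flashcards.

```csharp
using System;
using System.Collections.Generic;

public static class DistractorTopUp
{
    // Keeps bank results first, then fills the remaining slots from the fallback list.
    public static List<string> Merge(
        string targetWord, IEnumerable<string> fromBank, IEnumerable<string> fromFallback, int count = 3)
    {
        // Seeding the set with the target word keeps it out of the distractors.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { targetWord };
        var result = new List<string>();

        foreach (var source in new[] { fromBank, fromFallback })
        {
            foreach (var word in source)
            {
                if (result.Count == count) return result;
                if (seen.Add(word)) result.Add(word);
            }
        }
        return result; // may hold fewer than `count` if both sources run dry
    }
}
```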
---
## ⚡ Performance Considerations
### Query Optimization
1. **Composite index**: (CEFRLevel, PartOfSpeech, WordLength)
2. **Covering index**: includes the commonly queried columns
3. **Paged queries**: avoid loading too much data at once
### Caching Strategy
```csharp
public class CachedDistractorGenerationService
{
    private readonly IMemoryCache _cache;
    private readonly TimeSpan _cacheExpiry = TimeSpan.FromHours(1);

    public CachedDistractorGenerationService(IMemoryCache cache)
    {
        _cache = cache;
    }

    public async Task<List<string>> GenerateDistractorsAsync(string targetWord, string cefrLevel, string partOfSpeech)
    {
        var cacheKey = $"distractors:{targetWord}:{cefrLevel}:{partOfSpeech}";
        if (_cache.TryGetValue(cacheKey, out List<string>? cachedResult) && cachedResult is not null)
        {
            return cachedResult;
        }

        // GenerateDistractorsInternalAsync (not shown) performs the actual vocabulary bank query
        var result = await GenerateDistractorsInternalAsync(targetWord, cefrLevel, partOfSpeech);
        _cache.Set(cacheKey, result, _cacheExpiry);
        return result;
    }
}
```
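One consequence of caching the final distractor list per target word is that the same three options come back for the whole expiry window. If that is undesirable, an alternative (an assumption, not something this spec prescribes) is to cache the candidate pool for a given level, part of speech, and length, and to sample from it on every request. The sketch below uses a hypothetical minimal TTL cache, `CandidatePoolCache`, in place of `IMemoryCache` so that it is self-contained; the clock is injectable for testing.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical minimal TTL cache; stands in for IMemoryCache in this sketch.
public class CandidatePoolCache
{
    private readonly ConcurrentDictionary<string, (DateTime Expires, List<string> Pool)> _entries = new();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _now;
    private int _loads;

    // Number of times the loader actually ran (cache misses).
    public int Loads => _loads;

    public CandidatePoolCache(TimeSpan ttl, Func<DateTime>? now = null)
    {
        _ttl = ttl;
        _now = now ?? (() => DateTime.UtcNow);
    }

    // Returns the cached pool for the key, calling `load` only on a miss or after expiry.
    public List<string> GetOrLoad(string cefrLevel, string partOfSpeech, int wordLength, Func<List<string>> load)
    {
        var key = $"pool:{cefrLevel}:{partOfSpeech}:{wordLength}";
        if (_entries.TryGetValue(key, out var entry) && entry.Expires > _now())
            return entry.Pool;

        var pool = load();
        Interlocked.Increment(ref _loads);
        _entries[key] = (_now() + _ttl, pool);
        return pool;
    }
}
```

Two concurrent misses on the same key may both run the loader in this sketch; for a read-mostly vocabulary bank that is usually acceptable, but it is a trade-off to weigh.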
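### Performance Targets
| Metric | Target | Monitored via |
|------|--------|----------|
| API response time | < 100ms | Application Insights |
| Database query time | < 50ms | EF Core logs |
| Cache hit rate | > 80% | Custom counter |
| Concurrent request throughput | > 1000 req/s | Load testing |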
---
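## 📊 Initial Data Setup
### Suggested Data Sources
1. **CEFR word lists**
   - Cambridge English Vocabulary Profile
   - Oxford 3000/5000 word lists
   - Word lists from graded course materials
2. **Part-of-speech tagging**
   - WordNet database
   - English part-of-speech dictionaries
   - Corpus analysis results
3. **Frequency rating**
   - Google Ngram Corpus
   - Brown Corpus
   - Contemporary English usage frequency statistics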
### Seed Data Script
```csharp
public class VocabularySeeder
{
    private readonly DramaLingDbContext _context;

    public VocabularySeeder(DramaLingDbContext context)
    {
        _context = context;
    }

    public async Task SeedInitialVocabularyAsync()
    {
        var vocabularies = new List<OptionsVocabulary>
        {
            // A1 level - nouns
            new() { Word = "cat", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 3 },
            new() { Word = "dog", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 3 },
            new() { Word = "book", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 4 },
            // A1 level - verbs
            new() { Word = "eat", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 3 },
            new() { Word = "run", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 3 },
            new() { Word = "walk", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 4 },
            // A1 level - pronouns
            new() { Word = "he", CEFRLevel = "A1", PartOfSpeech = "pronoun", WordLength = 2 },
            new() { Word = "she", CEFRLevel = "A1", PartOfSpeech = "pronoun", WordLength = 3 },
            new() { Word = "they", CEFRLevel = "A1", PartOfSpeech = "pronoun", WordLength = 4 },
            // A2 level - prepositions
            new() { Word = "under", CEFRLevel = "A2", PartOfSpeech = "preposition", WordLength = 5 },
            new() { Word = "above", CEFRLevel = "A2", PartOfSpeech = "preposition", WordLength = 5 },
            new() { Word = "behind", CEFRLevel = "A2", PartOfSpeech = "preposition", WordLength = 6 },
            // B1/B2 level - adjectives
            new() { Word = "beautiful", CEFRLevel = "B1", PartOfSpeech = "adjective", WordLength = 9 },
            new() { Word = "wonderful", CEFRLevel = "B1", PartOfSpeech = "adjective", WordLength = 9 },
            new() { Word = "excellent", CEFRLevel = "B2", PartOfSpeech = "adjective", WordLength = 9 },
            // B1 level - adverbs
            new() { Word = "quickly", CEFRLevel = "B1", PartOfSpeech = "adverb", WordLength = 7 },
            new() { Word = "carefully", CEFRLevel = "B1", PartOfSpeech = "adverb", WordLength = 9 },
            new() { Word = "suddenly", CEFRLevel = "B1", PartOfSpeech = "adverb", WordLength = 8 },
            // B2 level - conjunctions
            new() { Word = "however", CEFRLevel = "B2", PartOfSpeech = "conjunction", WordLength = 7 },
            new() { Word = "therefore", CEFRLevel = "B2", PartOfSpeech = "conjunction", WordLength = 9 },
            new() { Word = "although", CEFRLevel = "B2", PartOfSpeech = "conjunction", WordLength = 8 },
            // Interjections
            new() { Word = "wow", CEFRLevel = "A1", PartOfSpeech = "interjection", WordLength = 3 },
            new() { Word = "ouch", CEFRLevel = "A2", PartOfSpeech = "interjection", WordLength = 4 },
            new() { Word = "alas", CEFRLevel = "C1", PartOfSpeech = "interjection", WordLength = 4 },
            // Idioms (length counts the spaces, matching Word.Length)
            new() { Word = "break the ice", CEFRLevel = "B2", PartOfSpeech = "idiom", WordLength = 13 },
            new() { Word = "piece of cake", CEFRLevel = "B1", PartOfSpeech = "idiom", WordLength = 13 },
            new() { Word = "hit the books", CEFRLevel = "B2", PartOfSpeech = "idiom", WordLength = 13 },
            // ... more words
        };

        await _context.OptionsVocabularies.AddRangeAsync(vocabularies);
        await _context.SaveChangesAsync();
    }
}
```
---
## 🔄 Service Registration
### Startup.cs / Program.cs
```csharp
// Register services
builder.Services.AddScoped<IOptionsVocabularyService, OptionsVocabularyService>();
builder.Services.AddScoped<DistractorGenerationService>();

// In-memory cache
builder.Services.AddMemoryCache();

// Background service (optional; VocabularyQualityScoreUpdateService is not defined in this spec)
builder.Services.AddHostedService<VocabularyQualityScoreUpdateService>();
```
---
## 📈 Quality Assurance
### Algorithm Validation
1. **A/B testing**: compare learning outcomes between the old and new option generation methods
2. **Expert review**: language-learning experts assess option quality
3. **User feedback**: collect learners' feedback on option difficulty
### Monitoring Metrics
```csharp
public class DistractorQualityMetrics
{
    public double AverageResponseTime { get; set; }
    public double OptionVariability { get; set; } // Option diversity
    public double CEFRLevelAccuracy { get; set; } // CEFR matching accuracy
    public double UserSatisfactionScore { get; set; } // User satisfaction
    public int TotalDistractorsGenerated { get; set; }
    public DateTime MeasuredAt { get; set; }
}
```
---
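## 🚀 Implementation Phases
### Phase 1: Basic Implementation (1-2 weeks)
- [ ] Create the OptionsVocabulary entity and database migration
- [ ] Implement the basic OptionsVocabularyService functionality
- [ ] Build the core API endpoints
- [ ] Import the initial vocabulary data (1,000-5,000 words)
### Phase 2: Algorithm Optimization (1 week)
- [ ] Implement DistractorGenerationService
- [ ] Add synonym exclusion logic
- [ ] Implement the quality scoring system
- [ ] Add the caching mechanism
### Phase 3: Frontend Integration (1-2 days)
- [ ] Test the improvement on the existing API endpoint
- [ ] Verify option quality for each quiz type
- [ ] Performance testing and tuning
> **Note**: Because option generation is integrated into the existing API, the frontend does not need any code changes.
> It is only necessary to confirm that the improved backend option generation behaves as expected.
### Phase 4: Advanced Features (1-2 weeks)
- [ ] Management interface
- [ ] Bulk import tool
- [ ] Monitoring and analytics dashboard
- [ ] A/B testing framework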
---
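## 📋 Acceptance Criteria
### Functional Acceptance
- [ ] Generates suitable distractors based on CEFR level, part of speech, and word length
- [ ] API response time < 100ms
- [ ] Generated options contain no duplicates
- [ ] Supports all quiz types
### Quality Acceptance
- [ ] Distractors are of moderate difficulty, neither too easy nor too hard
- [ ] No obvious synonyms appear as distractors
- [ ] Spelling differences are reasonable, avoiding near-identical words
### Technical Acceptance
- [ ] Code coverage > 80%
- [ ] All unit tests pass
- [ ] API documentation is complete
- [ ] Performance tests pass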
---
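## 🔒 Security Considerations
### Data Protection
- Vocabulary bank data is not sensitive and needs no special encryption
- The management API requires administrator authorization
- Guard against SQL injection
### API Security
- Implement rate limiting to prevent abuse
- Validate and sanitize input
- Error messages must not leak system information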
---
## 📚 Related Documents
- [智能複習系統-第五階段開發計劃.md](./智能複習系統-第五階段開發計劃.md)
- [後端完成度評估報告.md](./後端完成度評估報告.md)
- [DramaLing API documentation](./docs/api-documentation.md)

---
**Specification completed**: 2025-09-29
**Next update**: after implementation is complete