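# Options Vocabulary Bank Functional Specification
**Version**: 1.0
**Date**: 2025-09-29
**Project**: DramaLing Intelligent English Learning System
**Module**: Quiz Option Generation System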
---
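## 📋 Feature Overview
### Background
Quiz option generation in DramaLing currently has the following problems:
- **The frontend uses simple placeholders**: `["其他選項1", "其他選項2", "其他選項3"]` (literally "Other option 1", "Other option 2", "Other option 3")
- **The backend picks at random**: distractors are drawn randomly from the user's own flashcards, with no matching logic
- **Option quality is inconsistent**: distractors can turn out far too easy or far too hard
- **No pedagogical grounding**: cognitive load theory for language learning is not taken into account
### Goal
Build a **smart options vocabulary bank** that automatically generates high-quality quiz distractors from the characteristics of the target word.
### Core Features
- **Three-parameter matching**: CEFR level, word length (character count), and part of speech
- **Smart filtering**: excludes unsuitable options such as synonyms and near-identical spellings
- **Extensibility**: supports adding vocabulary and refining the algorithm over time
- **Performance**: indexes and caching keep responses fast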
---
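## 🎯 Functional Requirements
### Core Requirements
| Requirement ID | Description | Priority |
|--------|------|-------|
| REQ-001 | Match words of similar difficulty by CEFR level | High |
| REQ-002 | Match words of similar length by character count | High |
| REQ-003 | Match words with the same part of speech | High |
| REQ-004 | Generate 3 distinct distractors per request | High |
| REQ-005 | Avoid choosing synonyms as distractors | Medium |
| REQ-006 | Avoid choosing words whose spelling is too similar | Medium |
| REQ-007 | Support multiple quiz types (vocabulary choice, listening, etc.) | Medium |
| REQ-008 | Provide a management interface for the vocabulary bank | Low |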
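
REQ-006 does not say how "too similar" is measured. One common choice is Levenshtein edit distance, sketched below. The class name `SpellingSimilarity`, the method names, and the `maxDistance` threshold are illustrative assumptions, not part of this spec.

```csharp
using System;

public static class SpellingSimilarity
{
    // Levenshtein distance computed with two rolling rows.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (var j = 0; j <= b.Length; j++) previous[j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (var j = 1; j <= b.Length; j++)
            {
                var substitutionCost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + substitutionCost);
            }
            (previous, current) = (current, previous); // the finished row becomes "previous"
        }
        return previous[b.Length];
    }

    // maxDistance is a hypothetical threshold that would need tuning against real data.
    public static bool IsTooSimilar(string target, string candidate, int maxDistance = 1) =>
        EditDistance(target.ToLowerInvariant(), candidate.ToLowerInvariant()) <= maxDistance;
}
```

With this threshold, a pair such as "affect" / "effect" would be rejected, while "running" / "jumping" would still be allowed.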
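### Non-Functional Requirements
| Requirement ID | Description | Target |
|--------|------|-------|
| NFR-001 | Response time | < 100 ms |
| NFR-002 | Vocabulary bank size | Initially ≥ 10,000 words |
| NFR-003 | Availability | 99.9% |
| NFR-004 | Scalability | Supports 100,000+ words |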
---
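## 🏗️ System Design
### Overall Architecture
```
┌───────────────────────┐    ┌───────────────────────┐    ┌───────────────────────┐
│  Frontend quiz page   │────│ Option generation API │────│Vocabulary bank service│
└───────────────────────┘    └───────────────────────┘    └───────────────────────┘
                                         │                            │
                                         ▼                            ▼
                             ┌───────────────────────┐    ┌───────────────────────┐
                             │      Cache layer      │    │Options vocabulary bank│
                             │    (Redis/Memory)     │    │      (Database)       │
                             └───────────────────────┘    └───────────────────────┘
```
### Core Components
1. **OptionsVocabulary entity** - data model for the vocabulary bank
2. **OptionsVocabularyService** - vocabulary bank business logic
3. **DistractorGenerationService** - distractor generation logic
4. **VocabularyMatchingEngine** - vocabulary matching algorithm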
---
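## 📊 Data Model Design
### OptionsVocabulary Entity
```csharp
using System;
using System.ComponentModel.DataAnnotations;
using Microsoft.EntityFrameworkCore;

namespace DramaLing.Api.Models.Entities;

// EF Core's [Index] attribute is declared on the class, not on individual properties.
[Index(nameof(Word), IsUnique = true, Name = "IX_OptionsVocabulary_Word")]
[Index(nameof(CEFRLevel), Name = "IX_OptionsVocabulary_CEFR")]
[Index(nameof(PartOfSpeech), Name = "IX_OptionsVocabulary_PartOfSpeech")]
[Index(nameof(WordLength), Name = "IX_OptionsVocabulary_WordLength")]
public class OptionsVocabulary
{
    /// <summary>Primary key.</summary>
    public Guid Id { get; set; }

    /// <summary>The word itself.</summary>
    [Required]
    [MaxLength(100)]
    public string Word { get; set; } = string.Empty;

    /// <summary>CEFR difficulty level (A1, A2, B1, B2, C1, C2).</summary>
    [Required]
    [MaxLength(2)]
    public string CEFRLevel { get; set; } = string.Empty;

    /// <summary>Part of speech (noun, verb, adjective, adverb, etc.).</summary>
    [Required]
    [MaxLength(20)]
    public string PartOfSpeech { get; set; } = string.Empty;

    /// <summary>Word length in characters.</summary>
    public int WordLength { get; set; }

    /// <summary>Usage frequency (1-5, where 5 is the most frequent).</summary>
    [Range(1, 5)]
    public int FrequencyRating { get; set; } = 3;

    /// <summary>Chinese translation (used to avoid synonyms).</summary>
    [MaxLength(200)]
    public string? ChineseTranslation { get; set; }

    /// <summary>Synonym list (JSON format, used for exclusion).</summary>
    [MaxLength(500)]
    public string? Synonyms { get; set; }

    /// <summary>Where the word came from (dictionary, corpus, manual).</summary>
    [MaxLength(50)]
    public string? Source { get; set; }

    /// <summary>Whether the entry is enabled.</summary>
    public bool IsActive { get; set; } = true;

    /// <summary>Quality score (used for priority ordering).</summary>
    [Range(0, 100)]
    public int QualityScore { get; set; } = 50;

    /// <summary>Creation time (UTC).</summary>
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;

    /// <summary>Last update time (UTC).</summary>
    public DateTime UpdatedAt { get; set; } = DateTime.UtcNow;
}
```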
### Composite Index Design
```csharp
// Configured in the DbContext
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Core lookup index: CEFR level + part of speech + word length
    modelBuilder.Entity<OptionsVocabulary>()
        .HasIndex(e => new { e.CEFRLevel, e.PartOfSpeech, e.WordLength })
        .HasDatabaseName("IX_OptionsVocabulary_Core_Matching");

    // Active-status index
    modelBuilder.Entity<OptionsVocabulary>()
        .HasIndex(e => new { e.IsActive, e.QualityScore })
        .HasDatabaseName("IX_OptionsVocabulary_Active_Quality");
}
```
---
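## 🔧 Service Layer Design
### IOptionsVocabularyService Interface
```csharp
namespace DramaLing.Api.Services;

public interface IOptionsVocabularyService
{
    /// <summary>Generates distractors for the target word.</summary>
    Task<List<string>> GenerateDistractorsAsync(
        string targetWord,
        string cefrLevel,
        string partOfSpeech,
        int count = 3);

    /// <summary>Adds a word to the options bank; returns whether it succeeded.</summary>
    Task<bool> AddVocabularyAsync(OptionsVocabulary vocabulary);

    /// <summary>Imports words in bulk; returns the number of words imported.</summary>
    Task<int> BulkImportAsync(List<OptionsVocabulary> vocabularies);

    /// <summary>Searches the vocabulary bank by the given criteria.</summary>
    Task<List<OptionsVocabulary>> SearchVocabulariesAsync(
        string? cefrLevel = null,
        string? partOfSpeech = null,
        int? minLength = null,
        int? maxLength = null,
        int limit = 100);
}
```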
### DistractorGenerationService Core Logic
```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Logging;

public class DistractorGenerationService
{
    private readonly DramaLingDbContext _context;
    private readonly IMemoryCache _cache;
    private readonly ILogger<DistractorGenerationService> _logger;

    public DistractorGenerationService(
        DramaLingDbContext context,
        IMemoryCache cache,
        ILogger<DistractorGenerationService> logger)
    {
        _context = context;
        _cache = cache;
        _logger = logger;
    }

    public async Task<List<string>> GenerateDistractorsAsync(
        string targetWord,
        string cefrLevel,
        string partOfSpeech)
    {
        var targetLength = targetWord.Length;

        // 1. Base filter: active entries other than the target word
        var baseQuery = _context.OptionsVocabularies
            .Where(v => v.IsActive && v.Word != targetWord);

        // 2. CEFR level match (same level plus adjacent levels)
        var allowedCEFRLevels = GetAllowedCEFRLevels(cefrLevel);
        baseQuery = baseQuery.Where(v => allowedCEFRLevels.Contains(v.CEFRLevel));

        // 3. Part-of-speech match
        baseQuery = baseQuery.Where(v => v.PartOfSpeech == partOfSpeech);

        // 4. Length match (within ±2 characters)
        var minLength = Math.Max(1, targetLength - 2);
        var maxLength = targetLength + 2;
        baseQuery = baseQuery.Where(v => v.WordLength >= minLength && v.WordLength <= maxLength);

        // 5. Exclude synonyms (where defined)
        // TODO: implement synonym exclusion

        // 6. Order by quality score; the random key only breaks ties between equal scores
        var candidates = await baseQuery
            .OrderByDescending(v => v.QualityScore)
            .ThenBy(v => Guid.NewGuid())
            .Take(10) // fetch extra candidates so later filtering has room to work
            .Select(v => v.Word)
            .ToListAsync();

        // 7. Final filtering and return
        return candidates.Take(3).ToList();
    }

    private List<string> GetAllowedCEFRLevels(string targetLevel)
    {
        var levels = new[] { "A1", "A2", "B1", "B2", "C1", "C2" };
        var targetIndex = Array.IndexOf(levels, targetLevel);
        if (targetIndex == -1) return new List<string> { targetLevel };

        var allowed = new List<string> { targetLevel };
        // Add the adjacent levels
        if (targetIndex > 0) allowed.Add(levels[targetIndex - 1]);
        if (targetIndex < levels.Length - 1) allowed.Add(levels[targetIndex + 1]);
        return allowed;
    }
}
```
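Step 5 above is left as a TODO. The sketch below shows one way synonym exclusion could work in memory on the fetched candidates. It assumes the `Synonyms` column holds a JSON array of strings and that the target word's own synonym list is available to the caller; the `Candidate` record and `SynonymFilter` class are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical projection of OptionsVocabulary: the word plus its raw Synonyms JSON.
public sealed record Candidate(string Word, string? Synonyms);

public static class SynonymFilter
{
    // Parses the Synonyms column, assumed here to be a JSON array of strings.
    public static IReadOnlyList<string> ParseSynonyms(string? json)
    {
        if (string.IsNullOrWhiteSpace(json)) return Array.Empty<string>();
        try
        {
            return JsonSerializer.Deserialize<List<string>>(json) ?? new List<string>();
        }
        catch (JsonException)
        {
            return Array.Empty<string>(); // treat malformed data as "no synonyms"
        }
    }

    public static List<string> ExcludeSynonyms(
        string targetWord,
        string? targetSynonymsJson,
        IEnumerable<Candidate> candidates,
        int count = 3)
    {
        // Words that may never appear as distractors: the target and its listed synonyms.
        var blocked = new HashSet<string>(ParseSynonyms(targetSynonymsJson), StringComparer.OrdinalIgnoreCase)
        {
            targetWord
        };

        return candidates
            .Where(c => !blocked.Contains(c.Word))
            // Also drop candidates that list the target as one of their own synonyms.
            .Where(c => !ParseSynonyms(c.Synonyms).Contains(targetWord, StringComparer.OrdinalIgnoreCase))
            .Select(c => c.Word)
            .Take(count)
            .ToList();
    }
}
```

Checking both directions matters because synonym data is rarely symmetric: one entry may list the other without the reverse being recorded.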
---
## 🌐 API Design
### Addition to StudyController
```csharp
/// <summary>
/// Generates quiz options (using the vocabulary bank).
/// </summary>
[HttpGet("question-options/{flashcardId}")]
public async Task<ActionResult<QuestionOptionsResponse>> GenerateQuestionOptions(
    Guid flashcardId,
    [FromQuery] string questionType = "vocab-choice")
{
    try
    {
        var flashcard = await _context.Flashcards.FindAsync(flashcardId);
        if (flashcard == null)
            return NotFound(new { Error = "Flashcard not found" });

        var options = await _distractorGenerationService.GenerateDistractorsAsync(
            flashcard.Word,
            flashcard.DifficultyLevel ?? "B1",
            flashcard.PartOfSpeech ?? "noun");

        // Add the correct answer, then shuffle
        var allOptions = new List<string> { flashcard.Word };
        allOptions.AddRange(options);
        var shuffledOptions = allOptions.OrderBy(x => Guid.NewGuid()).ToArray();

        return Ok(new QuestionOptionsResponse
        {
            QuestionType = questionType,
            Options = shuffledOptions,
            CorrectAnswer = flashcard.Word,
            TargetWord = flashcard.Word,
            CEFRLevel = flashcard.DifficultyLevel,
            PartOfSpeech = flashcard.PartOfSpeech
        });
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error generating question options for flashcard {FlashcardId}", flashcardId);
        return StatusCode(500, new { Error = "Internal server error" });
    }
}
```
### Vocabulary Bank Management API
```csharp
/// <summary>
/// Controller for managing the vocabulary bank.
/// </summary>
[ApiController]
[Route("api/[controller]")]
[Authorize(Roles = "Admin")]
public class OptionsVocabularyController : ControllerBase
{
    private readonly IOptionsVocabularyService _vocabularyService;

    public OptionsVocabularyController(IOptionsVocabularyService vocabularyService)
    {
        _vocabularyService = vocabularyService;
    }

    /// <summary>Adds a word to the options bank.</summary>
    [HttpPost]
    public async Task<IActionResult> AddVocabulary([FromBody] AddVocabularyRequest request)
    {
        var vocabulary = new OptionsVocabulary
        {
            Word = request.Word,
            CEFRLevel = request.CEFRLevel,
            PartOfSpeech = request.PartOfSpeech,
            WordLength = request.Word.Length,
            FrequencyRating = request.FrequencyRating,
            ChineseTranslation = request.ChineseTranslation,
            Synonyms = request.Synonyms, // carried over so synonym exclusion has data to work with
            Source = "manual"
        };
        var success = await _vocabularyService.AddVocabularyAsync(vocabulary);
        return success ? Ok() : BadRequest();
    }

    /// <summary>Imports words in bulk.</summary>
    [HttpPost("bulk-import")]
    public async Task<IActionResult> BulkImport([FromBody] List<AddVocabularyRequest> requests)
    {
        var vocabularies = requests.Select(r => new OptionsVocabulary
        {
            Word = r.Word,
            CEFRLevel = r.CEFRLevel,
            PartOfSpeech = r.PartOfSpeech,
            WordLength = r.Word.Length,
            FrequencyRating = r.FrequencyRating,
            ChineseTranslation = r.ChineseTranslation,
            Synonyms = r.Synonyms,
            Source = "bulk-import"
        }).ToList();
        var importedCount = await _vocabularyService.BulkImportAsync(vocabularies);
        return Ok(new { ImportedCount = importedCount });
    }

    /// <summary>Searches the vocabulary bank.</summary>
    [HttpGet("search")]
    public async Task<ActionResult<List<OptionsVocabulary>>> SearchVocabularies(
        [FromQuery] string? cefrLevel = null,
        [FromQuery] string? partOfSpeech = null,
        [FromQuery] int? minLength = null,
        [FromQuery] int? maxLength = null,
        [FromQuery] int limit = 100)
    {
        var vocabularies = await _vocabularyService.SearchVocabulariesAsync(
            cefrLevel, partOfSpeech, minLength, maxLength, limit);
        return Ok(vocabularies);
    }
}
```
---
## 📁 DTO Definitions
### QuestionOptionsResponse
```csharp
namespace DramaLing.Api.Models.DTOs;

public class QuestionOptionsResponse
{
    public string QuestionType { get; set; } = string.Empty;
    public string[] Options { get; set; } = Array.Empty<string>();
    public string CorrectAnswer { get; set; } = string.Empty;
    public string TargetWord { get; set; } = string.Empty;
    public string? CEFRLevel { get; set; }
    public string? PartOfSpeech { get; set; }
    public DateTime GeneratedAt { get; set; } = DateTime.UtcNow;
}
```
### AddVocabularyRequest
```csharp
using System.ComponentModel.DataAnnotations;

namespace DramaLing.Api.Models.DTOs;

public class AddVocabularyRequest
{
    [Required]
    [MaxLength(100)]
    public string Word { get; set; } = string.Empty;

    [Required]
    [RegularExpression("^(A1|A2|B1|B2|C1|C2)$")]
    public string CEFRLevel { get; set; } = string.Empty;

    [Required]
    [MaxLength(20)]
    public string PartOfSpeech { get; set; } = string.Empty;

    [Range(1, 5)]
    public int FrequencyRating { get; set; } = 3;

    [MaxLength(200)]
    public string? ChineseTranslation { get; set; }

    [MaxLength(500)]
    public string? Synonyms { get; set; }
}
```
---
## 💾 Database Migration
### Migration File
```csharp
using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddOptionsVocabularyTable : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.CreateTable(
            name: "OptionsVocabularies",
            columns: table => new
            {
                Id = table.Column<Guid>(nullable: false),
                Word = table.Column<string>(maxLength: 100, nullable: false),
                CEFRLevel = table.Column<string>(maxLength: 2, nullable: false),
                PartOfSpeech = table.Column<string>(maxLength: 20, nullable: false),
                WordLength = table.Column<int>(nullable: false),
                FrequencyRating = table.Column<int>(nullable: false, defaultValue: 3),
                ChineseTranslation = table.Column<string>(maxLength: 200, nullable: true),
                Synonyms = table.Column<string>(maxLength: 500, nullable: true),
                Source = table.Column<string>(maxLength: 50, nullable: true),
                IsActive = table.Column<bool>(nullable: false, defaultValue: true),
                QualityScore = table.Column<int>(nullable: false, defaultValue: 50),
                CreatedAt = table.Column<DateTime>(nullable: false),
                UpdatedAt = table.Column<DateTime>(nullable: false)
            },
            constraints: table =>
            {
                table.PrimaryKey("PK_OptionsVocabularies", x => x.Id);
            });

        // Indexes
        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Word",
            table: "OptionsVocabularies",
            column: "Word",
            unique: true);

        // Single-column indexes declared on the entity
        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_CEFR", table: "OptionsVocabularies", column: "CEFRLevel");
        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_PartOfSpeech", table: "OptionsVocabularies", column: "PartOfSpeech");
        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_WordLength", table: "OptionsVocabularies", column: "WordLength");

        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Core_Matching",
            table: "OptionsVocabularies",
            columns: new[] { "CEFRLevel", "PartOfSpeech", "WordLength" });

        migrationBuilder.CreateIndex(
            name: "IX_OptionsVocabulary_Active_Quality",
            table: "OptionsVocabularies",
            columns: new[] { "IsActive", "QualityScore" });
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropTable(name: "OptionsVocabularies");
    }
}
```
---
## 🔄 Use Cases
### Case 1: Vocabulary Choice Question
```
Target word: "beautiful" (B1, adjective, 9 characters)
Filter criteria:
- CEFR: A2, B1, B2 (same and adjacent levels)
- Part of speech: adjective
- Length: 7-11 characters
Possible distractors:
- "wonderful" (B1, adjective, 9 characters)
- "excellent" (B2, adjective, 9 characters)
- "attractive" (B2, adjective, 10 characters)
Final options: ["beautiful", "wonderful", "excellent", "attractive"]
```
### Case 2: Listening Quiz
```
Target word: "running" (A2, verb, 7 characters)
Filter criteria:
- CEFR: A1, A2, B1
- Part of speech: verb
- Length: 5-9 characters
Possible distractors:
- "jumping" (A2, verb, 7 characters)
- "walking" (A1, verb, 7 characters)
- "playing" (A2, verb, 7 characters)
Final options: ["running", "jumping", "walking", "playing"]
```
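The filter criteria in both cases follow mechanically from the target word: the same and adjacent CEFR levels, plus a ±2 character length window. The sketch below reproduces that derivation as a pure function; `MatchCriteria` and its member names are illustrative, not types defined by this spec.

```csharp
using System;
using System.Linq;

public sealed record MatchCriteria(string[] CefrLevels, string PartOfSpeech, int MinLength, int MaxLength)
{
    private static readonly string[] Levels = { "A1", "A2", "B1", "B2", "C1", "C2" };

    public static MatchCriteria For(string targetWord, string cefrLevel, string partOfSpeech)
    {
        var index = Array.IndexOf(Levels, cefrLevel);

        // Unknown level: match it literally. Known level: keep it and its direct neighbours.
        var allowed = index < 0
            ? new[] { cefrLevel }
            : Levels.Where((level, i) => Math.Abs(i - index) <= 1).ToArray();

        return new MatchCriteria(
            allowed,
            partOfSpeech,
            Math.Max(1, targetWord.Length - 2), // never below one character
            targetWord.Length + 2);
    }
}
```

For example, `MatchCriteria.For("beautiful", "B1", "adjective")` yields levels A2, B1, B2 and a length range of 7 to 11, matching Case 1.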
---
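## ⚡ Performance Considerations
### Query Optimization
1. **Composite index**: (CEFRLevel, PartOfSpeech, WordLength)
2. **Covering index**: includes the columns that queries read most often
3. **Paged queries**: avoid loading too much data at once
### Caching Strategy
```csharp
using Microsoft.Extensions.Caching.Memory;

public class CachedDistractorGenerationService
{
    private readonly DistractorGenerationService _inner;
    private readonly IMemoryCache _cache;
    private readonly TimeSpan _cacheExpiry = TimeSpan.FromHours(1);

    public CachedDistractorGenerationService(DistractorGenerationService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<List<string>> GenerateDistractorsAsync(string targetWord, string cefrLevel, string partOfSpeech)
    {
        var cacheKey = $"distractors:{targetWord}:{cefrLevel}:{partOfSpeech}";
        if (_cache.TryGetValue(cacheKey, out List<string>? cachedResult) && cachedResult is not null)
        {
            return cachedResult;
        }

        // Cache miss: query through the uncached service, then store the result
        var result = await _inner.GenerateDistractorsAsync(targetWord, cefrLevel, partOfSpeech);
        _cache.Set(cacheKey, result, _cacheExpiry);
        return result;
    }
}
```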
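One consequence of caching the final list is that a learner sees the same three distractors for a given word until the entry expires. If that is not the intent, one option is to cache a larger candidate pool and draw a fresh sample on each request. The sketch below shows only the sampling step, using a partial Fisher-Yates shuffle; `DistractorSampler` is a hypothetical helper, not something this spec defines.

```csharp
using System;
using System.Collections.Generic;

public static class DistractorSampler
{
    // Picks up to `count` distinct items uniformly at random from the pool.
    public static List<string> Sample(IReadOnlyList<string> pool, int count, Random rng)
    {
        var items = new List<string>(pool);   // copy so the cached pool is never mutated
        var take = Math.Min(count, items.Count);

        for (var i = 0; i < take; i++)
        {
            var j = rng.Next(i, items.Count); // random index in [i, Count)
            (items[i], items[j]) = (items[j], items[i]);
        }

        return items.GetRange(0, take);
    }
}
```

Only the first `count` positions are shuffled, so the cost stays proportional to the number of distractors requested rather than to the pool size (apart from the defensive copy).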
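### Performance Targets
| Metric | Target | How it is monitored |
|------|--------|----------|
| API response time | < 100 ms | Application Insights |
| Database query time | < 50 ms | EF Core logs |
| Cache hit rate | > 80% | Custom counter |
| Concurrent throughput | > 1000 req/s | Load testing |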
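
The cache hit rate is to be tracked with a custom counter. A minimal thread-safe sketch of such a counter follows; the class name `CacheHitCounter` and its members are assumptions for illustration, and a real deployment would likely publish the value to whatever telemetry system is in use.

```csharp
using System.Threading;

public sealed class CacheHitCounter
{
    private long _hits;
    private long _misses;

    // Interlocked keeps the counts correct when many requests update them concurrently.
    public void RecordHit() => Interlocked.Increment(ref _hits);
    public void RecordMiss() => Interlocked.Increment(ref _misses);

    // Fraction of lookups served from cache; 0 when nothing has been recorded yet.
    public double HitRate
    {
        get
        {
            var hits = Interlocked.Read(ref _hits);
            var total = hits + Interlocked.Read(ref _misses);
            return total == 0 ? 0.0 : (double)hits / total;
        }
    }
}
```

The caching service would call `RecordHit()` when `TryGetValue` succeeds and `RecordMiss()` otherwise, and the > 80% target can then be checked against `HitRate`.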
---
## 📊 Initial Data Setup
### Suggested Data Sources
1. **CEFR word lists**
   - Cambridge English Vocabulary Profile
   - Oxford 3000/5000 word lists
   - Word lists from graded course materials
2. **Part-of-speech tagging**
   - WordNet database
   - English part-of-speech dictionaries
   - Corpus analysis results
3. **Frequency ratings**
   - Google Ngram Corpus
   - Brown Corpus
   - Modern English usage frequency statistics
### Initial Data Script
```csharp
public class VocabularySeeder
{
    private readonly DramaLingDbContext _context;

    public VocabularySeeder(DramaLingDbContext context)
    {
        _context = context;
    }

    public async Task SeedInitialVocabularyAsync()
    {
        var vocabularies = new List<OptionsVocabulary>
        {
            // A1 level - nouns
            new() { Word = "cat", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 3, FrequencyRating = 5, ChineseTranslation = "貓" },
            new() { Word = "dog", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 3, FrequencyRating = 5, ChineseTranslation = "狗" },
            new() { Word = "book", CEFRLevel = "A1", PartOfSpeech = "noun", WordLength = 4, FrequencyRating = 5, ChineseTranslation = "書" },
            // A1 level - verbs
            new() { Word = "eat", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 3, FrequencyRating = 5, ChineseTranslation = "吃" },
            new() { Word = "run", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 3, FrequencyRating = 4, ChineseTranslation = "跑" },
            new() { Word = "walk", CEFRLevel = "A1", PartOfSpeech = "verb", WordLength = 4, FrequencyRating = 5, ChineseTranslation = "走" },
            // B1/B2 level - adjectives
            new() { Word = "beautiful", CEFRLevel = "B1", PartOfSpeech = "adjective", WordLength = 9, FrequencyRating = 4, ChineseTranslation = "美麗的" },
            new() { Word = "wonderful", CEFRLevel = "B1", PartOfSpeech = "adjective", WordLength = 9, FrequencyRating = 4, ChineseTranslation = "精彩的" },
            new() { Word = "excellent", CEFRLevel = "B2", PartOfSpeech = "adjective", WordLength = 9, FrequencyRating = 4, ChineseTranslation = "優秀的" },
            // ... more words
        };

        await _context.OptionsVocabularies.AddRangeAsync(vocabularies);
        await _context.SaveChangesAsync();
    }
}
```
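Because `Word` carries a unique index, running the seeder a second time (or seeding a list with repeated entries) would fail when the changes are saved. A possible guard is to filter the incoming words against what is already stored before calling `AddRangeAsync`. The helper below is a sketch of that filter only; `SeedDeduplicator` is a hypothetical name, and case-insensitive comparison is an assumption that depends on the database collation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SeedDeduplicator
{
    // Returns the incoming words that are neither already stored nor repeated within the batch.
    public static List<string> NewWordsOnly(IEnumerable<string> existingWords, IEnumerable<string> incomingWords)
    {
        var seen = new HashSet<string>(existingWords, StringComparer.OrdinalIgnoreCase);

        return incomingWords
            .Select(w => w.Trim())
            .Where(w => w.Length > 0 && seen.Add(w)) // Add returns false for duplicates
            .ToList();
    }
}
```

The seeder could load the existing `Word` values once, pass them in with the candidate list, and insert only the entities whose word survives the filter.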
---
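## 🔄 Service Registration
### Startup.cs / Program.cs
```csharp
// Register services
builder.Services.AddScoped<IOptionsVocabularyService, OptionsVocabularyService>();
builder.Services.AddScoped<DistractorGenerationService>();
// In-memory cache
builder.Services.AddMemoryCache();
// Background service (optional); the hosted service type is not named in this spec
// builder.Services.AddHostedService<...>();
```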
---
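## 📈 Quality Assurance
### Algorithm Validation
1. **A/B testing**: compare learning outcomes between the old and new option generation methods
2. **Expert review**: language learning experts assess option quality
3. **User feedback**: collect learners' feedback on option difficulty
### Monitoring Metrics
```csharp
public class DistractorQualityMetrics
{
    public double AverageResponseTime { get; set; }
    public double OptionVariability { get; set; }      // diversity of generated options
    public double CEFRLevelAccuracy { get; set; }      // accuracy of CEFR matching
    public double UserSatisfactionScore { get; set; }  // user satisfaction
    public int TotalDistractorsGenerated { get; set; }
    public DateTime MeasuredAt { get; set; }
}
```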
---
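## 🚀 Implementation Plan
### Phase 1: Basic Implementation (1-2 weeks)
- [ ] Create the OptionsVocabulary entity and database migration
- [ ] Implement the basic OptionsVocabularyService functionality
- [ ] Build the core API endpoints
- [ ] Import the initial vocabulary data (1,000-5,000 words)
### Phase 2: Algorithm Optimization (1 week)
- [ ] Implement DistractorGenerationService
- [ ] Add synonym exclusion logic
- [ ] Implement the quality scoring system
- [ ] Add caching
### Phase 3: Frontend Integration (3-5 days)
- [ ] Update the frontend generateOptions function
- [ ] Integrate the new API endpoints
- [ ] Test each quiz type
- [ ] Performance testing and tuning
### Phase 4: Advanced Features (1-2 weeks)
- [ ] Management interface
- [ ] Bulk import tool
- [ ] Monitoring and analytics dashboard
- [ ] A/B testing framework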
---
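## 📋 Acceptance Criteria
### Functional Acceptance
- [ ] Generates suitable distractors based on CEFR level, part of speech, and word length
- [ ] API response time < 100 ms
- [ ] Generated options contain no duplicates
- [ ] Supports all quiz types
### Quality Acceptance
- [ ] Distractor difficulty is appropriate (neither too easy nor too hard)
- [ ] No obvious synonyms appear as distractors
- [ ] Spelling differences are reasonable (no near-identical spellings)
### Technical Acceptance
- [ ] Code coverage > 80%
- [ ] All unit tests pass
- [ ] API documentation is complete
- [ ] Performance tests pass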
---
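## 🔒 Security Considerations
### Data Protection
- Vocabulary bank data is not sensitive and needs no special encryption
- Management APIs require administrator authorization
- Guard against SQL injection
### API Security
- Implement rate limiting to prevent abuse
- Validate and sanitize input
- Error messages must not leak system information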
---
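## 📚 Related Documents
- [Intelligent Review System: Phase 5 Development Plan](./智能複習系統-第五階段開發計劃.md)
- [Backend Completion Assessment Report](./後端完成度評估報告.md)
- [DramaLing API documentation](./docs/api-documentation.md)
---
**Specification completed**: 2025-09-29
**Next update**: after implementation is complete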