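docs: add technical spec for sentence speaking practice integration

- Detailed plan for the front-end and back-end integration of the sentence speaking practice feature
- Integration design for the Microsoft Azure Speech Services pronunciation assessment API
- Complete API interface spec and database schema design
- Implementation spec for recording with the Web Audio API
- Plan for extending the review system's quizType (sentence-speaking)
- Multi-dimensional scoring design (accuracy / fluency / completeness / prosody)
- Cost analysis and deployment considerations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

parent fce5138c55
commit 99677fc014

@@ -0,0 +1,745 @@
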
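# Sentence Speaking Practice Integration Spec

## 📋 Overview

This document lays out the full technical specification for adding a "sentence speaking practice" feature to the DramaLing vocabulary learning system. It covers the front-end components, the back-end API, the Microsoft Azure Speech Services integration, and the system architecture.

---

## 🎯 Feature goals

### Learning value
- **Active practice**: move learners from passive recognition to active spoken output
- **Pronunciation correction**: use AI to assess pronunciation accuracy and fluency
- **Use in context**: practice each word inside a complete example sentence

### User experience
- **Visual guidance**: show the example sentence's image to make the context clear
- **Immediate feedback**: return pronunciation scores and suggestions for improvement
- **Seamless integration**: fit into the existing review system without changing its flow

---
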
## 🖥️ Front-end spec

### Analysis of the existing component

**File location**: `note/archive/components/review/review-tests/SentenceSpeakingTest.tsx`

**Component structure**:
```typescript
interface SentenceSpeakingTestProps extends BaseReviewProps {
  exampleImage?: string
  onImageClick?: (image: string) => void
}

// Core features
// - Displays the example sentence image
// - Record button (🎤 Start recording)
// - Displays the target sentence
// - Result feedback area
```

### Front-end upgrade requirements

#### 1. **Recording**
```typescript
// State to add
interface AudioRecordingState {
  isRecording: boolean
  audioBlob: Blob | null
  recordingTime: number   // elapsed seconds, shown in the UI while recording
  isProcessing: boolean
}

// Recording with getUserMedia and the MediaRecorder API
const startRecording = async () => {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const mediaRecorder = new MediaRecorder(stream)
  // Recording logic goes here
}
```

#### 2. **Displaying the score**
```typescript
interface PronunciationResult {
  overallScore: number       // overall score (0-100)
  accuracyScore: number      // accuracy
  fluencyScore: number       // fluency
  completenessScore: number  // completeness
  prosodyScore: number       // prosody (intonation / rhythm)
  feedback: string[]         // suggestions for improvement
  transcribedText: string    // speech-to-text result
}
```

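#### 3. **UI interaction flow**
1. Show the example image and the target sentence
2. The user clicks the record button → recording starts (a recording animation is shown)
3. The user clicks again → recording stops → the audio is uploaded
4. A loading animation is shown → the score is displayed
5. A confidence level is assigned automatically from the score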
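
The five steps above amount to a small state machine. The sketch below is one way to model it as a pure transition function; the names `Phase`, `RecordingUiState`, `RecordingUiEvent` and `transition` are hypothetical and not part of the existing code base, and the confidence thresholds (85 and 70) are the ones this spec uses for its score mapping.

```typescript
// Hypothetical names; only the four UI phases come from the flow in the spec.
type Phase = 'idle' | 'recording' | 'processing' | 'result'

interface RecordingUiState {
  phase: Phase
  overallScore: number | null
  confidenceLevel: number | null
}

type RecordingUiEvent =
  | { type: 'RECORD_CLICKED' }
  | { type: 'ASSESSMENT_RECEIVED'; overallScore: number }

const initialState: RecordingUiState = {
  phase: 'idle',
  overallScore: null,
  confidenceLevel: null
}

// Same thresholds as the spec's score-to-confidence mapping.
const toConfidence = (score: number): number =>
  score >= 85 ? 2 : score >= 70 ? 1 : 0

function transition(state: RecordingUiState, event: RecordingUiEvent): RecordingUiState {
  if (event.type === 'RECORD_CLICKED') {
    // Step 2: first click starts a (new) recording.
    if (state.phase === 'idle' || state.phase === 'result') {
      return { phase: 'recording', overallScore: null, confidenceLevel: null }
    }
    // Step 3: second click stops recording; upload and assessment begin.
    if (state.phase === 'recording') {
      return { ...state, phase: 'processing' }
    }
    // Clicks while the assessment is in flight are ignored.
    return state
  }
  // Steps 4 and 5: a score arrives and is mapped to a confidence level.
  if (state.phase !== 'processing') return state
  return {
    phase: 'result',
    overallScore: event.overallScore,
    confidenceLevel: toConfidence(event.overallScore)
  }
}
```

Keeping the transitions pure makes the flow easy to unit test without a microphone; a React component could drive it through `useReducer`.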

---

## 🔧 Back-end spec

### Microsoft Azure Speech Services integration

#### 1. **NuGet package**
```xml
<PackageReference Include="Microsoft.CognitiveServices.Speech" Version="1.38.0" />
```

#### 2. **Configuration**
```csharp
public class AzureSpeechOptions
{
    public const string SectionName = "AzureSpeech";
    public string SubscriptionKey { get; set; } = string.Empty;
    public string Region { get; set; } = "eastus";
    public string Language { get; set; } = "en-US";
    public bool EnableDetailedResult { get; set; } = true;
    public int TimeoutSeconds { get; set; } = 30;
}
```

#### 3. **Core service**
```csharp
public interface IPronunciationAssessmentService
{
    Task<PronunciationResult> EvaluatePronunciationAsync(
        Stream audioStream,
        string referenceText,
        string language = "en-US"
    );
}

public class AzurePronunciationAssessmentService : IPronunciationAssessmentService
{
    // Azure Speech Services integration (outline only)
    public async Task<PronunciationResult> EvaluatePronunciationAsync(...)
    {
        // 1. Configure the Speech SDK
        var config = SpeechConfig.FromSubscription(apiKey, region);

        // 2. Set the pronunciation assessment parameters
        var pronunciationConfig = new PronunciationAssessmentConfig(
            referenceText,
            GradingSystem.HundredMark,
            Granularity.Phoneme
        );

        // 3. Process the audio stream and obtain the assessment result
        // 4. Convert to the unified PronunciationResult format
    }
}
```

---

## 🌐 API design

### Endpoint

#### **POST `/api/speech/pronunciation-assessment`**

**Request format**:
```http
Content-Type: multipart/form-data

audio: [audio file] (WAV/WebM/MP3, max 10MB)
referenceText: "He overstepped the boundaries of acceptable behavior."
flashcardId: "b2bb23b8-16dd-44b2-bf64-34c468f2d362"
language: "en-US" (optional, defaults to en-US)
```

**Response format**:
```json
{
  "success": true,
  "data": {
    "assessmentId": "uuid-here",
    "flashcardId": "b2bb23b8-16dd-44b2-bf64-34c468f2d362",
    "referenceText": "He overstepped the boundaries...",
    "transcribedText": "He overstep the boundary of acceptable behavior",
    "scores": {
      "overall": 85,
      "accuracy": 82,
      "fluency": 88,
      "completeness": 90,
      "prosody": 80
    },
    "wordLevelResults": [
      {
        "word": "overstepped",
        "accuracy": 75,
        "errorType": "Mispronunciation"
      }
    ],
    "feedback": [
      "Overall pronunciation is good",
      "Watch the stress placement in 'overstepped'",
      "Pace is moderate and intonation sounds natural"
    ],
    "confidenceLevel": 2,
    "processingTime": "1.2s"
  }
}
```
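
The response nests its scores under `data.scores`, while the front-end `PronunciationResult` interface in this spec is flat. A minimal sketch of one way to bridge the two is shown below; `AssessmentEnvelope` and `toPronunciationResult` are hypothetical names, and the envelope type only declares the fields the conversion actually reads.

```typescript
// Hypothetical envelope type: a subset of the response body shown in the spec.
interface AssessmentEnvelope {
  success: boolean
  data?: {
    transcribedText: string
    scores: {
      overall: number
      accuracy: number
      fluency: number
      completeness: number
      prosody: number
    }
    feedback: string[]
  }
  error?: string
  message?: string
}

// Flat shape used by the front end (copied from the spec).
interface PronunciationResult {
  overallScore: number
  accuracyScore: number
  fluencyScore: number
  completenessScore: number
  prosodyScore: number
  feedback: string[]
  transcribedText: string
}

// Unwraps a successful envelope, or throws with the error code and message.
function toPronunciationResult(body: AssessmentEnvelope): PronunciationResult {
  if (!body.success || !body.data) {
    throw new Error(`${body.error ?? 'UNKNOWN_ERROR'}: ${body.message ?? 'assessment failed'}`)
  }
  const { scores, feedback, transcribedText } = body.data
  return {
    overallScore: scores.overall,
    accuracyScore: scores.accuracy,
    fluencyScore: scores.fluency,
    completenessScore: scores.completeness,
    prosodyScore: scores.prosody,
    feedback,
    transcribedText
  }
}
```

Doing the conversion in one place keeps UI components independent of the wire format, so the envelope can change without touching the result display.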

### Error handling

**Typical error response**:
```json
{
  "success": false,
  "error": "AUDIO_TOO_SHORT",
  "message": "The recording is too short; please record at least 1 second",
  "details": {
    "minDuration": 1000,
    "actualDuration": 500
  }
}
```

**Error types**:
- `AUDIO_TOO_SHORT` - recording shorter than the minimum duration
- `AUDIO_TOO_LONG` - recording too long (> 30 seconds)
- `INVALID_AUDIO_FORMAT` - unsupported audio format
- `SPEECH_SERVICE_ERROR` - Azure service error
- `NO_SPEECH_DETECTED` - no speech detected
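
Three of these codes can be decided before Azure is ever called. The following sketch shows a pre-flight check that could run on either side of the API; `validateRecording` and `RecordingInfo` are hypothetical names, the 1 s and 30 s limits come from the error definitions above, and the format list is assumed to match the `SupportedFormats` setting in this spec's `appsettings.json`.

```typescript
type AssessmentErrorCode =
  | 'AUDIO_TOO_SHORT'
  | 'AUDIO_TOO_LONG'
  | 'INVALID_AUDIO_FORMAT'
  | 'SPEECH_SERVICE_ERROR'
  | 'NO_SPEECH_DETECTED'

const MIN_DURATION_MS = 1000
const MAX_DURATION_MS = 30000
// Assumed to mirror the SupportedFormats configuration value.
const SUPPORTED_FORMATS = ['audio/wav', 'audio/webm', 'audio/mp3']

// Hypothetical input type: what the caller knows about a finished recording.
interface RecordingInfo {
  durationMs: number
  mimeType: string
}

// Returns the first applicable error code, or null when the recording may be assessed.
function validateRecording(info: RecordingInfo): AssessmentErrorCode | null {
  // Strip parameters such as ";codecs=opus" before comparing.
  const baseType = (info.mimeType.split(';')[0] ?? '').trim().toLowerCase()
  if (!SUPPORTED_FORMATS.includes(baseType)) return 'INVALID_AUDIO_FORMAT'
  if (info.durationMs < MIN_DURATION_MS) return 'AUDIO_TOO_SHORT'
  if (info.durationMs > MAX_DURATION_MS) return 'AUDIO_TOO_LONG'
  return null
}
```

Running the same check in the browser before uploading avoids spending an Azure request on a recording that would be rejected anyway.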

---

## 📊 Database design

### New assessment records table

```sql
CREATE TABLE PronunciationAssessments (
    Id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),
    UserId UNIQUEIDENTIFIER NOT NULL,
    FlashcardId UNIQUEIDENTIFIER NOT NULL,
    ReferenceText NVARCHAR(500) NOT NULL,
    TranscribedText NVARCHAR(500),

    -- Scores
    OverallScore DECIMAL(5,2),
    AccuracyScore DECIMAL(5,2),
    FluencyScore DECIMAL(5,2),
    CompletenessScore DECIMAL(5,2),
    ProsodyScore DECIMAL(5,2),

    -- Metadata
    AudioDuration DECIMAL(8,3),
    ProcessingTime DECIMAL(8,3),
    AzureRequestId NVARCHAR(100),

    CreatedAt DATETIME2 DEFAULT GETUTCDATE(),

    -- Foreign key constraints
    FOREIGN KEY (UserId) REFERENCES Users(Id),
    FOREIGN KEY (FlashcardId) REFERENCES Flashcards(Id)
);

-- Indexes
CREATE INDEX IX_PronunciationAssessments_UserId_CreatedAt
ON PronunciationAssessments(UserId, CreatedAt DESC);

CREATE INDEX IX_PronunciationAssessments_FlashcardId
ON PronunciationAssessments(FlashcardId);
```

---

## 🔄 System integration

### 1. Extending the review system

#### **Extending quizType**
```typescript
// hooks/review/useReviewSession.ts
interface QuizItem {
  quizType: 'flip-card' | 'vocab-choice' | 'sentence-speaking'
  // ... other properties unchanged
}
```

#### **Updating quiz item generation**
```typescript
// Add inside generateQuizItemsFromFlashcards
quizItems.push(
  // existing flip-card and vocab-choice items...
  {
    id: `${card.id}-sentence-speaking`,
    cardId: card.id,
    cardData: cardState,
    quizType: 'sentence-speaking',
    order: order++,
    isCompleted: false,
    wrongCount: 0,
    skipCount: 0
  }
)
```
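
For context, here is a self-contained sketch of what the generator might look like once the third quiz type is added. It is not the actual `generateQuizItemsFromFlashcards`: the `Flashcard` shape and the function name `generateQuizItems` are assumptions, while the `QuizItem` fields follow the snippet above.

```typescript
type QuizType = 'flip-card' | 'vocab-choice' | 'sentence-speaking'

// Hypothetical card shape; the real card state has more fields.
interface Flashcard {
  id: string
  word: string
  example: string
}

interface QuizItem {
  id: string
  cardId: string
  cardData: Flashcard
  quizType: QuizType
  order: number
  isCompleted: boolean
  wrongCount: number
  skipCount: number
}

// Emits one quiz item per card and quiz type, numbered in generation order.
function generateQuizItems(cards: Flashcard[]): QuizItem[] {
  const quizTypes: QuizType[] = ['flip-card', 'vocab-choice', 'sentence-speaking']
  const items: QuizItem[] = []
  let order = 0
  for (const card of cards) {
    for (const quizType of quizTypes) {
      items.push({
        id: `${card.id}-${quizType}`,
        cardId: card.id,
        cardData: card,
        quizType,
        order: order++,
        isCompleted: false,
        wrongCount: 0,
        skipCount: 0
      })
    }
  }
  return items
}
```

Driving the loop from a list of quiz types means a future fourth type only needs a new union member and list entry.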

### 2. Mapping scores to confidence levels

**Azure score → system confidence level**:
```typescript
const mapAzureScoreToConfidence = (overallScore: number): number => {
  if (overallScore >= 85) return 2  // excellent (high confidence)
  if (overallScore >= 70) return 1  // good (medium confidence)
  return 0                          // needs improvement (low confidence)
}
```

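---

## ⚙️ Technical implementation

### Front-end implementation

#### 1. **Audio recording**
```typescript
// components/shared/AudioRecorder.tsx (new shared component)
export class AudioRecorder {
  private mediaRecorder: MediaRecorder | null = null
  private audioChunks: Blob[] = []

  async startRecording(): Promise<void> {
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: {
        echoCancellation: true,
        noiseSuppression: true,
        sampleRate: 16000 // rate recommended for Azure; browsers treat it as a hint
      }
    })

    // WebM/Opus is not available in every browser, so fall back to the browser default
    const preferred = 'audio/webm;codecs=opus'
    const options = MediaRecorder.isTypeSupported(preferred) ? { mimeType: preferred } : undefined
    this.mediaRecorder = new MediaRecorder(stream, options)

    // Collect audio chunks as the recorder produces them
    this.audioChunks = []
    this.mediaRecorder.ondataavailable = (event) => {
      if (event.data.size > 0) this.audioChunks.push(event.data)
    }
    this.mediaRecorder.start()
  }

  stopRecording(): Promise<Blob> {
    // Stop recording and resolve with the recorded audio Blob
    return new Promise((resolve, reject) => {
      const recorder = this.mediaRecorder
      if (!recorder) return reject(new Error('Recording has not started'))
      recorder.onstop = () => {
        recorder.stream.getTracks().forEach((track) => track.stop()) // release the microphone
        resolve(new Blob(this.audioChunks, { type: recorder.mimeType }))
      }
      recorder.stop()
    })
  }
}
```
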
#### 2. **API client**
```typescript
// lib/services/speechAssessment.ts
export const speechAssessmentService = {
  async evaluatePronunciation(
    audioBlob: Blob,
    referenceText: string,
    flashcardId: string
  ): Promise<PronunciationResult> {
    const formData = new FormData()
    formData.append('audio', audioBlob, 'recording.webm')
    formData.append('referenceText', referenceText)
    formData.append('flashcardId', flashcardId)

    const response = await fetch('/api/speech/pronunciation-assessment', {
      method: 'POST',
      body: formData
    })

    // fetch does not reject on 4xx/5xx responses, so check the status explicitly
    if (!response.ok) {
      throw new Error(`Pronunciation assessment failed with HTTP ${response.status}`)
    }
    return response.json()
  }
}
```

### Back-end implementation

#### 1. **Controller**
```csharp
[ApiController]
[Route("api/speech")]
public class SpeechController : BaseController
{
    private readonly IPronunciationAssessmentService _assessmentService;

    [HttpPost("pronunciation-assessment")]
    public async Task<IActionResult> EvaluatePronunciation(
        [FromForm] IFormFile audio,
        [FromForm] string referenceText,
        [FromForm] string flashcardId,
        [FromForm] string language = "en-US")
    {
        // 1. Validate the request
        if (audio == null || audio.Length == 0)
            return BadRequest("The audio file must not be empty");

        if (audio.Length > 10 * 1024 * 1024) // 10MB limit
            return BadRequest("The audio file is too large");

        // 2. Open the audio stream
        using var audioStream = audio.OpenReadStream();

        // 3. Call Azure Speech Services
        var result = await _assessmentService.EvaluatePronunciationAsync(
            audioStream, referenceText, language);

        // 4. Save the assessment record to the database
        // 5. Return the result
        return Ok(result);
    }
}
```

#### 2. **Azure Speech Services integration**
```csharp
public class AzurePronunciationAssessmentService : IPronunciationAssessmentService
{
    private readonly AzureSpeechOptions _options;

    public AzurePronunciationAssessmentService(IOptions<AzureSpeechOptions> options)
    {
        _options = options.Value;
    }

    public async Task<PronunciationResult> EvaluatePronunciationAsync(
        Stream audioStream, string referenceText, string language)
    {
        var stopwatch = Stopwatch.StartNew();

        // 1. Set up the Azure Speech config
        var speechConfig = SpeechConfig.FromSubscription(
            _options.SubscriptionKey,
            _options.Region
        );
        speechConfig.SpeechRecognitionLanguage = language;

        // 2. Set the pronunciation assessment parameters
        var pronunciationConfig = new PronunciationAssessmentConfig(
            referenceText,
            GradingSystem.HundredMark,
            Granularity.Word,   // word-level assessment
            enableMiscue: true  // enable miscue detection
        );
        pronunciationConfig.EnableProsodyAssessment(); // required for ProsodyScore

        // 3. Set up the audio input and feed it the uploaded audio.
        //    The push stream defaults to 16 kHz 16-bit mono PCM, so WebM/MP3
        //    uploads need converting (or a compressed format configured) first.
        using var pushStream = AudioInputStream.CreatePushStream();
        using var audioConfig = AudioConfig.FromStreamInput(pushStream);
        using var buffer = new MemoryStream();
        await audioStream.CopyToAsync(buffer);
        pushStream.Write(buffer.ToArray());
        pushStream.Close();

        // 4. Create the speech recognizer
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
        pronunciationConfig.ApplyTo(recognizer);

        // 5. Process the audio and get the result
        var result = await recognizer.RecognizeOnceAsync();

        // 6. Parse the assessment result
        var pronunciationResult = PronunciationAssessmentResult.FromResult(result);

        // 7. Convert to the system format
        return new PronunciationResult
        {
            OverallScore = pronunciationResult.PronunciationScore,
            AccuracyScore = pronunciationResult.AccuracyScore,
            FluencyScore = pronunciationResult.FluencyScore,
            CompletenessScore = pronunciationResult.CompletenessScore,
            ProsodyScore = pronunciationResult.ProsodyScore,
            TranscribedText = result.Text,
            ProcessingTime = stopwatch.ElapsedMilliseconds
        };
    }
}
```

---

## 🌍 Environment configuration

### appsettings.json
```json
{
  "AzureSpeech": {
    "SubscriptionKey": "${AZURE_SPEECH_KEY}",
    "Region": "eastus",
    "Language": "en-US",
    "EnableDetailedResult": true,
    "TimeoutSeconds": 30,
    "MaxAudioSizeMB": 10,
    "SupportedFormats": ["audio/wav", "audio/webm", "audio/mp3"]
  }
}
```

### Environment variables
```bash
# Development
AZURE_SPEECH_KEY=your_azure_speech_key_here
AZURE_SPEECH_REGION=eastus

# Production (uses Azure Key Vault)
AZURE_SPEECH_KEY_VAULT_URL=https://dramaling-vault.vault.azure.net/
```

---

## 📱 Review system integration

### 1. Extending the quiz type

**Where to update**: `hooks/review/useReviewSession.ts`

```typescript
// Type definition update
interface QuizItem {
  quizType: 'flip-card' | 'vocab-choice' | 'sentence-speaking'
}

// Generation logic extension (Line 110-132)
quizItems.push(
  // existing quiz types...
  {
    id: `${card.id}-sentence-speaking`,
    cardId: card.id,
    cardData: cardState,
    quizType: 'sentence-speaking',
    order: order++,
    isCompleted: false,
    wrongCount: 0,
    skipCount: 0
  }
)
```

### 2. Extending the rendering logic

**Where to update**: `app/review/page.tsx` (Line 332-350)

```typescript
// Add a new conditional render
{currentQuizItem.quizType === 'sentence-speaking' && (
  <SentenceSpeakingQuiz
    card={currentCard}
    onAnswer={handleAnswer}
    onSkip={handleSkip}
  />
)}
```

---

## 🎨 User interface design

### Recording state UI

#### **Before recording**
```html
<button class="bg-red-500 hover:bg-red-600">
  🎤 Start recording
</button>
<p class="text-gray-600">Click to start recording the example sentence</p>
```

#### **While recording**
```html
<button class="bg-red-600 animate-pulse">
  ⏹️ Stop recording
</button>
<div class="flex items-center gap-2">
  <div class="w-2 h-2 bg-red-500 rounded-full animate-ping"></div>
  <span>Recording... {recordingTime}s</span>
</div>
```

#### **Processing**
```html
<div class="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600"></div>
<p>AI is assessing your pronunciation... (about 2-3 seconds)</p>
```

#### **Result**
```html
<div class="bg-blue-50 border border-blue-200 rounded-lg p-6">
  <h4 class="font-semibold text-blue-900 mb-3">Pronunciation assessment result</h4>

  <!-- Overall score -->
  <div class="flex items-center gap-3 mb-4">
    <div class="text-3xl font-bold text-blue-600">{overallScore}</div>
    <div class="text-gray-600">Overall score (out of 100)</div>
  </div>

  <!-- Detailed scores -->
  <div class="grid grid-cols-2 gap-3 mb-4">
    <div class="bg-white p-3 rounded border">
      <div class="text-sm text-gray-600">Accuracy</div>
      <div class="font-semibold text-lg">{accuracyScore}</div>
    </div>
    <!-- Other score items... -->
  </div>

  <!-- Speech-to-text result -->
  <div class="bg-gray-50 p-3 rounded border mb-4">
    <div class="text-sm text-gray-600 mb-1">Recognized text</div>
    <div class="font-mono text-sm">{transcribedText}</div>
  </div>

  <!-- Suggestions for improvement -->
  <div class="space-y-1">
    {feedback.map(item => (
      <div class="text-sm text-blue-700">• {item}</div>
    ))}
  </div>
</div>
```

---

## 🔄 Data flow design

### End-to-end flow

```mermaid
graph TD
    A[User clicks record] --> B[Front end starts recording]
    B --> C[User finishes and clicks stop]
    C --> D[Front end builds the audio Blob]
    D --> E[Upload to the back-end API]
    E --> F[Back end receives the audio file]
    F --> G[Call Azure Speech Services]
    G --> H[Azure returns the assessment]
    H --> I[Save to the database]
    I --> J[Return the scores to the front end]
    J --> K[Front end displays the result]
    K --> L[Map to a confidence level]
    L --> M[Update review progress]
```

### Error handling flow

```mermaid
graph TD
    A[API request] --> B{Validate audio}
    B -->|Fail| C[Return validation error]
    B -->|Pass| D[Call Azure API]
    D -->|Success| E[Process result]
    D -->|Failure| F{Error type}
    F -->|Network| G[Return retry prompt]
    F -->|Quota| H[Return quota error]
    F -->|Other| I[Return generic error]
```
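
The "error type" branch of the second diagram could be sketched as two small functions. Everything here is illustrative: `classifyFailure` and `toUserFacingError` are hypothetical names, treating HTTP 429 as the quota signal is an assumption about how the upstream failure is surfaced, and the `NETWORK_ERROR` and `QUOTA_EXCEEDED` codes are not among the error types this spec defines.

```typescript
type FailureKind = 'network' | 'quota' | 'other'

interface UserFacingError {
  code: string
  retryable: boolean
  message: string
}

// Assumption: a null status means no HTTP response was received at all,
// and 429 is how a quota or throttling failure shows up.
function classifyFailure(httpStatus: number | null): FailureKind {
  if (httpStatus === null) return 'network'
  if (httpStatus === 429) return 'quota'
  return 'other'
}

// Maps each branch of the diagram to what the client is told.
function toUserFacingError(kind: FailureKind): UserFacingError {
  if (kind === 'network') {
    return { code: 'NETWORK_ERROR', retryable: true, message: 'Connection problem, please try again' }
  }
  if (kind === 'quota') {
    return { code: 'QUOTA_EXCEEDED', retryable: false, message: 'The assessment quota has been reached' }
  }
  return { code: 'SPEECH_SERVICE_ERROR', retryable: false, message: 'The assessment could not be completed' }
}
```

Only the network branch is marked retryable, matching the diagram's "return retry prompt" node; the other two fail fast so a quota problem is not made worse by automatic retries.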

---

## 🚀 Implementation phases

### Phase 1: Foundation
1. ✅ Back-end Azure Speech Services integration
2. ✅ Basic API endpoint
3. ✅ Database schema update
4. ✅ Environment configuration

### Phase 2: Front-end integration
1. ✅ AudioRecorder shared component
2. ✅ SentenceSpeakingQuiz component refactor
3. ✅ API service client
4. ✅ Review system integration

### Phase 3: Optimization and testing
1. ✅ Recording quality optimization
2. ✅ Scoring accuracy tuning
3. ✅ Error handling hardening
4. ✅ Performance and stability testing

---

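## 🔧 Development tools and configuration

### Development environment requirements
- **Azure Speech Services account** (free tier: 5,000 requests per month)
- **Audio test environment** (a development machine with a microphone)
- **HTTPS environment** (microphone access through `getUserMedia` requires a secure context: HTTPS or `localhost`)

### Testing strategy
- **Unit tests**: mock the Azure service
- **Integration tests**: end-to-end audio flow
- **Load tests**: concurrent request handling
- **User tests**: accuracy of assessments on real speech

### Deployment considerations
- **Temporary audio files**: delete immediately after processing
- **Azure quota management**: monitor usage to stay within limits
- **CDN configuration**: optimize static assets
- **Load balancing**: handle high volumes of concurrent recording requests

---
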
## 📈 Performance metrics and monitoring

### Key metrics
- **Assessment latency**: target < 3 seconds
- **Accuracy**: > 85% agreement with human assessment
- **Success rate**: API request success rate > 99%
- **User satisfaction**: track improvement in pronunciation over time

### Monitored items
- Azure API request count and latency
- Audio file size distribution
- Score distribution
- Error type statistics

---

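## 💰 Cost estimate

### Azure Speech Services pricing (2024)
- **Free tier**: 5,000 requests per month
- **Standard tier**: $1 USD per 1,000 requests
- **Estimated usage**: 100 users × 10 requests/day × 30 days = 30,000 requests/month
- **Monthly cost**: ~$25 USD once the 5,000 free requests are deducted (~$30 USD without the free quota)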
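
The estimate can be reproduced with a few lines of arithmetic. The sketch below uses the pricing figures exactly as this spec states them (per-request pricing and a 5,000-request free quota); `estimateMonthlyCost` and its input names are hypothetical, and a 30-day month is assumed.

```typescript
interface CostInputs {
  users: number
  requestsPerUserPerDay: number
  daysPerMonth: number
  freeRequestsPerMonth: number
  usdPerThousandRequests: number
}

interface CostEstimate {
  totalRequests: number
  billableRequests: number
  costUsd: number
}

// Monthly volume, minus the free quota, priced per 1,000 requests.
function estimateMonthlyCost(inputs: CostInputs): CostEstimate {
  const totalRequests = inputs.users * inputs.requestsPerUserPerDay * inputs.daysPerMonth
  const billableRequests = Math.max(0, totalRequests - inputs.freeRequestsPerMonth)
  const costUsd = (billableRequests / 1000) * inputs.usdPerThousandRequests
  return { totalRequests, billableRequests, costUsd }
}

const estimate = estimateMonthlyCost({
  users: 100,
  requestsPerUserPerDay: 10,
  daysPerMonth: 30,
  freeRequestsPerMonth: 5000,
  usdPerThousandRequests: 1
})
console.log(estimate) // { totalRequests: 30000, billableRequests: 25000, costUsd: 25 }
```

Because cost scales linearly with users and daily requests, a per-user daily cap directly bounds the monthly bill.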
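
### Recommended cost controls
- Cache requests to avoid repeated assessments
- Set a daily usage cap per user
- Monitor for abnormal usage patterns

---

## 🔐 Security

### Audio data protection
- **Encryption in transit**: HTTPS/TLS 1.3
- **Temporary file cleanup**: delete audio files as soon as processing completes
- **Access control**: users can only have their own recordings assessed

### API security
- **Rate limiting**: at most 10 requests per user per minute
- **File validation**: check the audio format and content
- **Input sanitization**: guard against injection attacks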
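
The rate limit of 10 requests per user per minute could be enforced with a sliding window. This is a minimal in-memory sketch under stated assumptions: `SlidingWindowRateLimiter` is a hypothetical class, it keeps state in a single process only, and a multi-instance deployment would need a shared store instead.

```typescript
// In-memory sliding window: remembers the timestamps of each user's recent requests.
class SlidingWindowRateLimiter {
  private readonly hits = new Map<string, number[]>()
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number = 10, windowMs: number = 60000) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true and records the request if the user is under the limit.
  tryAcquire(userId: string, nowMs: number = Date.now()): boolean {
    const windowStart = nowMs - this.windowMs
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > windowStart)
    const allowed = recent.length < this.limit
    if (allowed) recent.push(nowMs)
    this.hits.set(userId, recent)
    return allowed
  }
}
```

Passing `nowMs` explicitly keeps the limiter deterministic in tests; production callers can rely on the `Date.now()` default.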

---

## 📚 Technical references

### Official Microsoft documentation
- [Azure Speech Services Pronunciation Assessment](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-pronunciation-assessment)
- [Speech SDK for C#](https://learn.microsoft.com/en-us/dotnet/api/microsoft.cognitiveservices.speech)
- [Interactive Language Learning Tutorial](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-learning-with-pronunciation-assessment)

### Implementation samples
- [GitHub Azure Speech Samples](https://github.com/Azure-Samples/cognitive-services-speech-sdk)
- [Pronunciation Assessment Samples](https://github.com/Azure-Samples/azure-ai-speech/tree/main/pronunciation-assessment)

---

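## ✅ Acceptance criteria

### Functional acceptance
1. ✅ Users can record 1-30 seconds of audio
2. ✅ The back end assesses pronunciation accurately and returns multi-dimensional scores
3. ✅ The front end clearly displays the scores and suggestions for improvement
4. ✅ Scores map correctly to the review system's confidence levels

### Performance acceptance
1. ✅ Audio processing latency < 5 seconds
2. ✅ API response time < 10 seconds (including network latency)
3. ✅ The system handles concurrent recording requests
4. ✅ No memory leaks or accumulated audio files

### User experience acceptance
1. ✅ The recording process is intuitive
2. ✅ Scores are meaningful and constructive
3. ✅ Error messages are clear and helpful
4. ✅ Integrates seamlessly with the existing review flow

This spec adds a speaking practice feature to DramaLing, aimed at improving learners' pronunciation and their ability to use the language in practice.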