A few small optimizations, nothing major

2026-03-09 10:21:33 +08:00
parent dc9e4bd0ef
commit 237c96f629
19 changed files with 243 additions and 322 deletions
--- a/analysis_output/analysis_report.md
+++ b/analysis_output/analysis_report.md
@@ -1,172 +1,135 @@
-<!-- 
-  Generated: 2026-03-09 09:40:55
-  Data: cleaned_data.csv (84 rows x 21 cols)
-  Quality: 82.0/100
-  Template: templates/iot_ops_report.md
-  AI never accessed raw data rows - only aggregated tool results
-->
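+# Ticket Analysis Report

-# "XX Brand Connected-Vehicle O&M Analysis Report"
-
-## 1. Overall Issue Distribution and Efficiency Analysis
-
-### 1.1 Ticket Type Distribution and Trends
-
-Total tickets: 84.  
-Breakdown:
- TSP issues: 1 (1.19%)
- APP issues: 5 (5.95%)
- TBOX issues: 16 (19.05%)
- Inquiries: 45 (53.57%)
- Other: 17 (20.24%)
-
-> (A period-over-period trend could be added)
+Generated: 2026-03-09 10:10:08
+Data source: cleaned_data.csv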

 ---

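-### 1.2 Resolution Efficiency Analysis
+# Ticket Data Analysis Report

-> (Period-over-period trends could be added later, e.g. total ticket turnaround time and a growth trend chart)
+## 1. Executive Summary

-| Ticket type | Total | Handled by first line | Escalated to second line | Mean duration (h) | Median (h) | First-contact resolution rate (%) | TSP handling count |
-| --- | --- | --- | --- | --- | --- | --- | --- |
-| TSP issues | 1 |     |     | 216 | 216 |     |     |
-| APP issues | 5 |     |     | 354 | 354 |     |     |
-| TBOX issues | 16 |     |     | 2140.5 | 2140.5 |     |     |
-| Inquiries | 45 |     |     | 1224.528 | 984 |     |     |
-| Total  | 67 |     |     |     |     |     |     |
+Based on a full analysis of 84 tickets, the key findings are:
+
+1. **Issue types are highly concentrated**: Remote control issues account for **66.67%** (56/84) and are the main source of tickets; optimizing the related systems should come first.
+2. **Handling efficiency varies widely**: the mean time to close is **54.77 days**, but assignees differ greatly: Vsevolod Tsoi averages **152 days**, while 刘康男 averages only **2 days**.
+3. **Issues cluster on specific vehicle models**: EXEED RX（T22） leads with **38 tickets**, followed by JAECOO J7（T1EJ） with **22 tickets**; these two models are the hot spots.
+4. **Ticket status needs attention**: temporarily closed tickets make up **17.9%** (15/84), and their mean time to close (**80.2 days**) is well above that of closed tickets (**49.25 days**).
+5. **The data contains outliers**: the time-to-close column has **2 outliers** (277 and 237 days), **2.38%** of the data, indicating extreme handling times.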
+
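+## 2. Data Overview
+
+- **Data type**: Ticket
+- **Data size**: 84 rows × 21 columns
+- **Data quality score**: 88.0/100
+- **Key fields**: ticket number, source, creation date, issue type, issue description, handling process, tracking log, severity, ticket status, module, assignee, close date, vehicle model, VIN, time to close (days), and others
+- **Analysis period**: January 2, 2025 through February 2025 (based on creation date)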
+
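+## 3. Detailed Analysis
+
+### 3.1 Ticket Volume and Trend Analysis
+
+- **Overall trend**: **62** tickets were created in January 2025, about **2** per day; **27** were created in February 2025, roughly **56%** fewer than in January, indicating reduced ticket creation.
+- **Peak dates**: January 13 had the most new tickets (**8**); early January (January 2 to January 6) saw **15** tickets and was the peak period.
+- **Distribution**: February volume was low overall and spread out, possibly related to business activity or system stability.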
+
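+### 3.2 Issue Type and Module Distribution Analysis
+
+- **Issue type distribution**:
+  - Remote control: **56** (**66.67%**), the main source of issues.
+  - Network: **6** (**7.14%**), the second most common issue type.
+  - Navi: **5** (**5.95%**).
+  - Application: **4** (**4.76%**).
+- **Module distribution**:
+  - local O&M: **45** (**53.57%**), the module with the most issues.
+  - TBOX: **16** (**19.05%**), the second-largest module.
+- **Insight**: Remote control issues are strongly associated with the local O&M module; check that module's remote control functionality first.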
+
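+### 3.3 Severity and Status Analysis
+
+- **Severity distribution**:
+  - Low: **75** (**89.3%**).
+  - Medium: **9** (**10.7%**); these 9 high-impact issues need close attention.
+- **Ticket status distribution**:
+  - Closed (close): **69** (**82.1%**).
+  - Temporarily closed (temporary close): **15** (**17.9%**); handle these first.
+- **Correlation**: temporarily closed tickets have a mean time to close of **80.2 days**, well above the **49.25 days** of closed tickets, suggesting that the temporarily closed status may prolong handling.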
+
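+### 3.4 Handling Efficiency and Assignee Analysis
+
+- **Time-to-close statistics**:
+  - Mean: **54.77 days**; median: **41 days**.
+  - The distribution is right-skewed (skewness **1.92**) with a standard deviation of **48.19 days**, indicating a long tail.
+- **Efficiency differences between assignees**:
+  - Vsevolod Tsoi: mean **152 days** (highest).
+  - 刘康男: mean **2 days** (lowest).
+  - Other assignees: Evgeniy (**62.39 days**), Kostya (**26.6 days**), Vadim (**62.39 days**).
+- **Efficiency by source channel**:
+  - Mail: mean **60.35 days** (longest).
+  - Telegram channel: mean **16.5 days** (shortest).
+- **Insight**: the mail handling process may have a bottleneck and should be optimized; assignee efficiency gaps should be addressed through training or resource reallocation.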
+
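+### 3.5 Vehicle-Specific Issue Analysis
+
+- **Vehicle model distribution**:
+  - EXEED RX（T22）: **38** (largest share).
+  - JAECOO J7（T1EJ）: **22**.
+  - EXEED VX FL（M36T）: **17**.
+  - CHERY TIGGO 9 (T28): **7**.
+- **Repeated VINs**:
+  - LVTDD24B1RG023450 and LVTDD24B1RG021245 each appear **2 times**; every other VIN appears once, indicating that a few vehicles were reported more than once.
+- **Insight**: EXEED RX（T22） and JAECOO J7（T1EJ） are the models with the most issues; run a targeted investigation on them.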
+
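+### 3.6 Outlier Detection and Data Quality Check
+
+- **Time-to-close outliers**:
+  - **2 outliers** found: 277 and 237 days, **2.38%** of the data.
+  - The IQR upper bound is **171.88 days**; both outliers are far above it.
+- **Distribution characteristics**:
+  - Clearly right-skewed (skewness **1.92**); the mean exceeds the median and the standard deviation is large.
+- **Recommendation**: review the handling records of the outlier tickets so extreme values do not distort the analysis.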
+
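+## 4. Conclusions and Recommendations
+
+### Conclusions
+1. Ticket issues are heavily concentrated in Remote control and the local O&M module, which should be optimized first.
+2. Handling efficiency varies significantly, with clear bottlenecks across assignees and channels.
+3. Issues cluster on EXEED RX（T22） and JAECOO J7（T1EJ）; targeted remediation is recommended.
+4. Temporarily closed tickets take longer to handle and need closer tracking and re-evaluation.
+5. The data contains outliers that need further review.
+
+### Actionable Recommendations
+1. **Optimize the Remote control system**: for the Remote control issues that make up **66.67%** of tickets, perform root-cause analysis and improve the related features to reduce ticket volume.
+2. **Improve handling efficiency**:
+   - Address assignee efficiency gaps through training or resource reallocation, focusing on Vsevolod Tsoi's handling process.
+   - Streamline the mail handling process (mean **60.35 days**) by introducing automation or adding staff.
+3. **Focus on high-incidence models**: run targeted investigations on EXEED RX（T22） and JAECOO J7（T1EJ） and draw up a preventive maintenance plan.
+4. **Handle temporarily closed tickets**: re-evaluate the **15** temporarily closed tickets first, aiming to bring their mean time to close down from **80.2 days** toward the level of closed tickets (**49.25 days**).
+5. **Improve data quality**: review the ticket records behind the **2 outliers** (277 and 237 days) to confirm data accuracy, and set up outlier monitoring.
+
+Carrying out these recommendations can substantially improve ticket handling efficiency, reduce recurring issues, and optimize resource allocation.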

 ---

-### 1.3 Issue Distribution by Vehicle Model
+## Analysis Traceability

-| Vehicle model | Count | Share | Mean time to close (days) | Mean time to close (h) |
-| --- | --- | --- | --- | --- |
-| EXEED RX（T22） | 38 | 45.24% | 58.05 | 1393.2 |
-| JAECOO J7（T1EJ） | 22 | 26.19% | 53.59 | 1286.16 |
-| EXEED VX FL（M36T） | 17 | 20.24% | 39.12 | 938.88 |
-| CHERY TIGGO 9 (T28) | 7 | 8.33% | 78.71 | 1889.04 |
+This report is based on the following analysis tasks:

---
-
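-## 2. Topic Analysis by Issue Category
-
-### 2.1 TSP Issues
-
-Overview for the month:
-
-| Ticket type | Total | Handled by overseas first line | Escalated to domestic second line | Mean duration (h) | Median (h) |
-| --- | --- | --- | --- | --- | --- |
-| TSP issues | 1 |     |     | 216 | 216 |
-
-#### 2.1.1 TSP Issues: Second-Level Categories and Third-Level Distribution
-
-No data for this period
-
-#### 2.1.2 Top Issues
-
-| Frequent issue summary | Example keywords | Cause  | Resolution | Approx. share |
-| --- | --- | --- | --- | --- |
-| No data for this period |     |     |     |     |
-
-> Clustering analysis file (to be produced): [4-1TSP问题聚类.xlsx]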
-
---
-
-### 2.2 APP Issues
-
-Overview for the month:
-
-| Ticket type | Total | Handled by first line | Escalated to second line | First-line mean handling time (h) | Second-line mean handling time (h) | Mean duration (h) | Median (h) |
-| --- | --- | --- | --- | --- | --- | --- | --- |
-| APP issues | 5 |     |     |     |     | 354 | 354 |
-
-#### 2.2.1 APP Issues: Second-Level Category Distribution
-
-| Issue type | Count | Share | Mean time to close (days) | Mean time to close (h) |
-| --- | --- | --- | --- | --- |
-| Application | 4 | 4.76% | 14.75 | 354 |
-| Problem with auth in member center | 3 | 3.57% | 24.67 | 592.08 |
-
-#### 2.2.2 Top Issues
-
-| Frequent issue summary | Example keywords | Cause  | Resolution | Count  | Approx. share |
-| --- | --- | --- | --- | --- | --- |
-| Application issues | Application |     |     | 4 | 4.76% |
-| Member center authentication issues | Problem with auth in member center |     |     | 3 | 3.57% |
-
-> Clustering analysis file (to be produced): [4-2APP问题聚类.xlsx]
-
---
-
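-### 2.3 TBOX Issues
-
-> Total turnaround time and period-over-period growth trend (a combined bar and line chart could be used)
-
-#### 2.3.1 TBOX Issues: Second-Level Category Distribution
-
-| Issue type | Count | Share | Mean time to close (days) | Mean time to close (h) |
-| --- | --- | --- | --- | --- |
-| Remote control | 56 | 66.67% | 66.5 | 1596 |
-| Network | 6 | 7.14% | 24 | 576 |
-| Activation SIM | 2 | 2.38% | 142.5 | 3420 |
-
-#### 2.3.2 Top Issues
-
-| Frequent issue summary | Example keywords | Cause  | Resolution | Approx. share |
-| --- | --- | --- | --- | --- |
-| Remote control issues | Remote control |     |     | 66.67% |
-| Network issues | Network |     |     | 7.14% |
-| SIM activation issues | Activation SIM |     |     | 2.38% |
-
-> Clustering analysis file: [4-3TBOX问题聚类.xlsx]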
-
---
-
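-### 2.4 DMC Issues
-
-> Total turnaround time and period-over-period growth trend (a combined bar and line chart could be used)
-
-#### 2.4.1 DMC Issues: Second-Level Category Distribution and Resolution Time
-
-| Issue type | Count | Share | Mean time to close (days) | Mean time to close (h) |
-| --- | --- | --- | --- | --- |
-| DMC module issues | 1 | 1.19% | 40 | 960 |
-
-#### 2.4.2 Top Issues
-
-| Frequent issue summary | Example keywords | Cause  | Resolution | Approx. share |
-| --- | --- | --- | --- | --- |
-| DMC module issues | DMC |     |     | 1.19% |
-
-> Clustering analysis file (to be produced): [4-4DMC问题处理.xlsx]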
-
---
-
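-### 2.5 Inquiries
-
-> Total turnaround time and period-over-period growth trend (a combined bar and line chart could be used)
-
-#### 2.5.1 Inquiries: Second-Level Category Distribution and Resolution Time
-
-| Issue type | Count | Share | Mean time to close (days) | Mean time to close (h) |
-| --- | --- | --- | --- | --- |
-| local O&M | 45 | 53.57% | 51.02 | 1224.48 |
-
-#### 2.5.2 Top Inquiries
-
-| Frequent issue summary | Example keywords | Cause  | Resolution | Approx. share |
-| --- | --- | --- | --- | --- |
-| Local O&M issues | local O&M |     |     | 53.57% |
-
-> Inquiry file (to be produced): [4-5咨询类问题处理.xlsx]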
-
---
-
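-## 3. Recommendations and Attachments
-
- For ticket and complaint details, see the attachment:
- Data quality score: 82.0/100
- Time-to-close outliers: 2 (277 days, 237 days), 2.38% of the data
- Assignee distribution: Vsevolod handled 31 tickets (37.80%), Evgeniy handled 28 (34.15%)
- Source distribution: mail 46 tickets (54.76%), Telegram bot 36 (42.86%)
+- ✓ Ticket Volume and Trend Analysis
+  - 62 tickets were created in January 2025, about 2 per day; January 13 had the most (8).
+  - 27 tickets were created in February 2025, roughly 56% fewer than in January, indicating reduced ticket creation.
+- ✓ Severity and Status Analysis
+  - Severity distribution: Low tickets account for 89.3% (75/84) and Medium tickets for 10.7% (9/84); the 9 Medium, high-impact issues need attention.
+  - Status distribution: closed (close) tickets account for 82.1% (69/84) and temporarily closed (temporary close) tickets for 17.9% (15/84); the 15 temporarily closed tickets should be handled first.
+- ✓ Issue Type and Module Distribution Analysis
+  - Among issue types, 'Remote control' has the largest share at 66.67% (56/84) and is the main source of issues.
+  - Among modules, 'local O&M' accounts for 53.57% (45/84) and is the module with the most issues.
+- ✓ Handling Efficiency and Assignee Analysis
+  - The mean time to close is 54.77 days and the median is 41 days; the distribution is right-skewed (skewness 1.92), so some tickets take much longer and there are efficiency bottlenecks.
+  - Grouped by assignee, Vsevolod Tsoi has the highest mean time to close (152 days) and 刘康男 the lowest (2 days), a significant efficiency gap.
+- ✓ Outlier Detection and Data Quality Check
+  - The `关闭时长(天)` column has 2 outliers (277 and 237 days), 2.38% of the data, far above the IQR upper bound of 171.88 days, indicating extreme values in the data.
+  - The data is clearly right-skewed (skewness 1.92): the mean of 54.77 days exceeds the median of 41 days and the standard deviation is 48.19 days, so time to close is unevenly distributed, with most tickets closed fairly quickly and a long tail.
+- ✓ Vehicle-Specific Issue Analysis
+  - By vehicle model, EXEED RX（T22） leads with 38 tickets, a substantial share of the total, followed by JAECOO J7（T1EJ） with 22, making these two models the hot spots.
+  - In the VIN counts, LVTDD24B1RG023450 and LVTDD24B1RG021245 each appear 2 times and every other VIN appears once, so VIN repetition is low but a few vehicles were reported more than once.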
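The outlier figures in section 3.6 of the report above rest on an IQR upper bound. A minimal sketch of that rule, assuming Tukey's 1.5 × IQR fences and Python's `statistics.quantiles`. The helper name `iqr_outliers` and the sample values are made up for illustration and are not rows from cleaned_data.csv; the tool that produced the report may interpolate quartiles differently, so its 171.88-day bound is not reproduced here.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower_bound, upper_bound, outliers) using Tukey's IQR fences."""
    # method="inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return lower, upper, [v for v in values if v < lower or v > upper]


# Hypothetical time-to-close values in days (NOT the real dataset).
durations = [2, 10, 20, 30, 41, 50, 60, 80, 100, 237, 277]
lower, upper, outliers = iqr_outliers(durations)
print(f"upper bound: {upper}, outliers: {outliers}")
```

Values above the upper fence are flagged for manual review rather than dropped, which matches the report's recommendation to check the handling records of those tickets.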
--- a/bar_chart_responsible.png
+++ b/bar_chart_responsible.png
--- a/bar_chart_source.png
+++ b/bar_chart_source.png
--- a/issue_type_bar_chart.png
+++ b/issue_type_bar_chart.png
--- a/module_bar_chart.png
+++ b/module_bar_chart.png
--- a/run_analysis_en.py
+++ b/run_analysis_en.py
@@ -63,13 +63,23 @@ def run_analysis(
    """
    Run the full AI-driven analysis pipeline.
    
+    Each run creates a timestamped subdirectory under output_dir:
+      output_dir/run_20260309_143025/
+        ├── analysis_report.md
+        └── charts/
+            ├── bar_chart.png
+            └── ...
+    
    Args:
        data_file: Path to any CSV file
        user_requirement: Natural language requirement (optional)
        template_file: Report template path (optional)
-        output_dir: Output directory
+        output_dir: Base output directory
    """
-    os.makedirs(output_dir, exist_ok=True)
+    # Create a timestamped subdirectory for each run
+    run_timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
+    run_dir = os.path.join(output_dir, f"run_{run_timestamp}")
+    os.makedirs(run_dir, exist_ok=True)
    config = get_config()

    print("\n" + "=" * 70)
@@ -77,6 +87,7 @@ def run_analysis(
    print("=" * 70)
    print(f"Start: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"Data:  {data_file}")
+    print(f"Output: {run_dir}")
    if template_file:
        print(f"Template: {template_file}")
    print("=" * 70)
@@ -117,6 +128,7 @@ def run_analysis(
    # ── Stage 4: AI Task Execution ──
    print("\n[4/5] AI Executing Tasks...")
    # Reuse DAL from Stage 1 — no need to load data again
+    dal.set_output_dir(run_dir)
    results: List[AnalysisResult] = []

    sorted_tasks = sorted(analysis_plan.tasks, key=lambda t: t.priority, reverse=True)
@@ -137,12 +149,12 @@ def run_analysis(

    # ── Stage 5: Report Generation ──
    print("\n[5/5] Generating Report...")
-    report_path = os.path.join(output_dir, "analysis_report.md")
+    report_path = os.path.join(run_dir, "analysis_report.md")

    if template_file and os.path.exists(template_file):
-        report = _generate_template_report(profile, results, template_file, config)
+        report = _generate_template_report(profile, results, template_file, config, run_dir)
    else:
-        report = generate_report(results, requirement, profile)
+        report = generate_report(results, requirement, profile, output_path=run_dir)

    # Save report
    with open(report_path, 'w', encoding='utf-8') as f:
@@ -155,7 +167,7 @@ def run_analysis(
    print("\n" + "=" * 70)
    print("Analysis Complete!")
    print(f"End: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
-    print(f"Output: {report_path}")
+    print(f"Output: {run_dir}")
    print("=" * 70)

    return True
@@ -165,7 +177,8 @@ def _generate_template_report(
    profile: DataProfile,
    results: List[AnalysisResult],
    template_path: str,
-    config
+    config,
+    run_dir: str = ""
 ) -> str:
    """Use AI to fill a template with data from task execution results."""
    client = OpenAI(api_key=config.llm.api_key, base_url=config.llm.base_url)
@@ -210,12 +223,17 @@ def _generate_template_report(
 {template}
 ```

+## 图表文件
+以下是分析过程中生成的图表文件，请在报告适当位置嵌入：
+{_collect_chart_paths(results, run_dir)}
+
 ## 要求
 1. 用实际数据填充模板中所有占位符
 2. 根据数据中的字段，智能映射到模板分类
 3. 所有数字必须来自分析结果，不要编造
 4. 如果某个模板分类在数据中没有对应，标注"本期无数据"
 5. 保持Markdown格式
+6. 在报告中嵌入图表，使用 ![描述](图表路径) 格式，让报告图文结合
 """

    print("  AI filling template with analysis results...")
@@ -243,6 +261,28 @@ def _generate_template_report(
    return header + report


+def _collect_chart_paths(results: List[AnalysisResult], run_dir: str = "") -> str:
+    """Collect all chart paths from task results for embedding in reports."""
+    paths = []
+    for r in results:
+        if not r.success:
+            continue
+        # From visualizations list
+        for viz in (r.visualizations or []):
+            if viz and viz not in paths:
+                paths.append(viz)
+        # From data dict (chart_path in tool results)
+        if isinstance(r.data, dict):
+            for key, val in r.data.items():
+                if isinstance(val, dict) and val.get('chart_path'):
+                    cp = val['chart_path']
+                    if cp not in paths:
+                        paths.append(cp)
+    if not paths:
+        return "(无图表)"
+    return "\n".join(f"- {p}" for p in paths)
+
+
 if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="AI-Driven Data Analysis")
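The per-run directory layout described in the `run_analysis` docstring can be sketched in isolation. This is a minimal illustration, assuming a hypothetical helper `make_run_dir` that is not part of the commit; it also creates the `charts/` subdirectory up front, which the hunks above do not show being created anywhere.

```python
import os
import tempfile
from datetime import datetime


def make_run_dir(base_dir: str, now: datetime) -> str:
    """Create base_dir/run_YYYYMMDD_HHMMSS/charts and return the run directory."""
    run_dir = os.path.join(base_dir, f"run_{now.strftime('%Y%m%d_%H%M%S')}")
    # exist_ok=True means two runs started within the same second share one directory.
    os.makedirs(os.path.join(run_dir, "charts"), exist_ok=True)
    return run_dir


with tempfile.TemporaryDirectory() as base:
    run_dir = make_run_dir(base, datetime(2026, 3, 9, 14, 30, 25))
    run_name = os.path.basename(run_dir)
    has_charts = os.path.isdir(os.path.join(run_dir, "charts"))
    print(run_name)
```

Passing the timestamp in as an argument keeps the helper deterministic under test. The one-second resolution of the name is worth keeping in mind if runs can be launched in quick succession.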
--- a/severity_pie_chart.png
+++ b/severity_pie_chart.png
--- a/src/__pycache__/data_access.cpython-311.pyc
+++ b/src/__pycache__/data_access.cpython-311.pyc
--- a/src/data_access.py
+++ b/src/data_access.py
@@ -35,6 +35,7 @@ class DataAccessLayer:
        """
        self._data = data  # Private data, not accessible to the AI
        self._file_path = file_path
+        self._output_dir = ""  # Output directory for charts and other files
        
    @classmethod
    def load_from_file(cls, file_path: str, max_retries: int = 3, optimize_memory: bool = True) -> 'DataAccessLayer':
@@ -205,10 +206,21 @@ class DataAccessLayer:
        # Default to the text type
        return 'text'
    
+    def set_output_dir(self, output_dir: str):
+        """
+        Set the output directory; charts and other files are saved under it.
+        
+        Args:
+            output_dir: Path to the output directory
+        """
+        self._output_dir = output_dir
+    
    def execute_tool(self, tool: Any, **kwargs) -> Dict[str, Any]:
        """
        Execute a tool and return aggregated results (safe).
        
+        If output_dir is set, chart files are saved automatically under output_dir/charts/.
+        
        Args:
            tool: Analysis tool instance
            **kwargs: Tool arguments
@@ -217,6 +229,10 @@ class DataAccessLayer:
            Tool execution result (aggregated data)
        """
        try:
+            # If an output directory is set, redirect chart output paths into it
+            if self._output_dir:
+                kwargs = self._fix_output_path(tool, kwargs)
+            
            result = tool.execute(self._data, **kwargs)
            return self._sanitize_result(result)
        except Exception as e:
@@ -227,6 +243,37 @@ class DataAccessLayer:
                'tool': tool.name
            }
    
+    def _fix_output_path(self, tool: Any, kwargs: dict) -> dict:
+        """
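+        Ensure chart output paths point into the output_dir/charts/ directory.
+        
+        Args:
+            tool: Tool instance
+            kwargs: Tool arguments
+        
+        Returns:
+            The adjusted arguments
+        """
+        # Check whether the tool declares an output_path parameter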
+        props = getattr(tool, 'parameters', {}).get('properties', {})
+        if 'output_path' not in props:
+            return kwargs
+        
+        charts_dir = str(Path(self._output_dir) / "charts")
+        
+        if 'output_path' in kwargs:
+            # The AI supplied a path, but it may be relative, e.g. "bar_chart.png"
+            output_path = kwargs['output_path']
+            if not Path(output_path).is_absolute() and not output_path.startswith(self._output_dir):
+                kwargs['output_path'] = str(Path(charts_dir) / Path(output_path).name)
+        else:
+            # The AI gave no path: use the default, redirected into the charts directory
+            default_path = props['output_path'].get('default', '')
+            if default_path:
+                kwargs['output_path'] = str(Path(charts_dir) / default_path)
+        
+        return kwargs
+    
    def _sanitize_result(self, result: Dict[str, Any]) -> Dict[str, Any]:
        """
        Ensure the result contains no raw data, only aggregated data.
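The redirect rule in `_fix_output_path` can be exercised on its own. A minimal sketch, assuming a hypothetical free function `redirect_chart_path` that mirrors only the branch where the AI supplied a path; it needs `pathlib.Path`, whose import in `src/data_access.py` is not visible in this diff.

```python
from pathlib import Path


def redirect_chart_path(output_dir: str, requested: str) -> str:
    """Mirror the diff's rule for a caller-supplied output_path."""
    # Absolute paths and paths already under output_dir pass through unchanged.
    if Path(requested).is_absolute() or requested.startswith(output_dir):
        return requested
    # Otherwise keep only the file name and place it in output_dir/charts/.
    return str(Path(output_dir) / "charts" / Path(requested).name)


expected = str(Path("out/run_1") / "charts" / "bar_chart.png")
print(redirect_chart_path("out/run_1", "bar_chart.png"))
print(redirect_chart_path("out/run_1", "tmp/plots/bar_chart.png"))
print(redirect_chart_path("out/run_1", "out/run_1/charts/pie.png"))
```

Two properties follow from the rule as written: any subdirectory in a relative path is discarded because only `.name` is kept, and the "already under output_dir" test is a plain string prefix check, so `out/run_10/x.png` would also count as being under `out/run_1`.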
--- a/src/engines/report_generation.py
+++ b/src/engines/report_generation.py
@@ -365,8 +365,8 @@ def generate_report(
            results, key_findings, structure, requirement, data_profile
        )
    
-    # Save the report
-    if output_path:
+    # Save the report (only when output_path points to a file)
+    if output_path and not os.path.isdir(output_path):
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(report)
    
@@ -386,13 +386,27 @@ def _generate_report_with_ai(
    
    # Build analysis data summaries (extract the actual data from results)
    data_summaries = []
+    all_chart_paths = []
    for r in results:
        if r.success and r.data:
-            data_str = json.dumps(r.data, ensure_ascii=False, default=str)[:500]
+            data_str = json.dumps(r.data, ensure_ascii=False, default=str)[:1500]
            data_summaries.append(f"### {r.task_name}\n{data_str}")
+        # Collect all chart paths
+        for viz in (r.visualizations or []):
+            if viz:
+                all_chart_paths.append(viz)
+        if isinstance(r.data, dict):
+            for key, val in r.data.items():
+                if isinstance(val, dict) and val.get('chart_path'):
+                    all_chart_paths.append(val['chart_path'])
    
    data_section = "\n\n".join(data_summaries) if data_summaries else "无详细数据"
    
+    # List of chart paths
+    charts_section = ""
+    if all_chart_paths:
+        charts_section = "\n可用图表文件（请在报告中嵌入）：\n" + "\n".join(f"- {p}" for p in all_chart_paths)
+    
    # Build the prompt
    prompt = f"""你是一位专业的数据分析师，需要根据分析结果生成一份完整的分析报告。

@@ -419,6 +433,7 @@ def _generate_report_with_ai(

 已完成的分析任务：
 {chr(10).join(f"- {r.task_name}: {'成功' if r.success else '失败'}, 洞察: {'; '.join(r.insights[:3])}" for r in results)}
+{charts_section}

 请生成一份专业的Markdown分析报告，包含：

@@ -433,6 +448,7 @@ def _generate_report_with_ai(
 - 提供可操作的建议
 - 使用清晰的结构和标题
 - 用中文撰写
+- 重要：在报告中嵌入图表，使用 ![描述](图表路径) 格式。将图表放在相关分析段落旁边，让报告图文结合。每个图表都要嵌入，不要遗漏。
 """
    
    try:
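Unlike `_collect_chart_paths` in run_analysis_en.py, the loop added to `_generate_report_with_ai` appends chart paths without checking for repeats, so a chart listed in both `visualizations` and `data` would appear twice in the prompt. One possible way to de-duplicate while keeping first-seen order, shown as a sketch with a hypothetical helper name `unique_in_order` that is not part of the commit:

```python
def unique_in_order(paths):
    """Drop empty entries and duplicates, keeping the first occurrence of each path."""
    # dict preserves insertion order, so fromkeys acts as an ordered set.
    return list(dict.fromkeys(p for p in paths if p))


collected = ["charts/bar.png", "", "charts/pie.png", "charts/bar.png", None]
deduped = unique_in_order(collected)
print(deduped)
```

Applied to `all_chart_paths` before `charts_section` is built, this would keep the "embed every chart" instruction from asking the model to embed the same file twice.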
--- a/src/engines/task_execution.py
+++ b/src/engines/task_execution.py
@@ -171,7 +171,8 @@ Instructions:
 1. Pick the most relevant tool and call it with correct column names.
 2. After each observation, decide if you need more data or can conclude.
 3. Aim for 2-4 tool calls total to gather enough data.
-4. When you have enough data, set is_completed=true and summarize findings in reasoning.
+4. IMPORTANT: For key findings, also generate visualizations (charts) using create_bar_chart, create_pie_chart, create_line_chart, or create_heatmap. The report needs charts embedded — text-only results are not enough.
+5. When you have enough data AND have generated at least one chart, set is_completed=true and summarize findings in reasoning.

 Respond ONLY with this JSON (no other text):
 {{
--- a/src/main.py
+++ b/src/main.py
@@ -223,6 +223,9 @@ class AnalysisOrchestrator:
        logger.info(f"加载数据文件: {self.data_file}")
        data_profile, self.data_access = ai_understand_data_with_dal(self.data_file)
        
+        # Set the output directory so charts and other files are saved in the right place
+        self.data_access.set_output_dir(str(self.output_dir))
+        
        logger.info(f"✓ 数据加载成功: {data_profile.row_count} 行, {data_profile.column_count} 列")
        logger.info(f"✓ 数据类型: {data_profile.inferred_type}")
        logger.info(f"✓ 数据质量分数: {data_profile.quality_score:.1f}/100")
--- a/src/tools/viz_tools.py
+++ b/src/tools/viz_tools.py
@@ -86,7 +86,7 @@ class CreateBarChartTool(AnalysisTool):
            # Prepare the data
            if y_column:
                # Group by x_column and sum y_column
-                plot_data = data.groupby(x_column)[y_column].sum().sort_values(ascending=False).head(top_n)
+                plot_data = data.groupby(x_column, observed=True)[y_column].sum().sort_values(ascending=False).head(top_n)
            else:
                # Count occurrences
                plot_data = data[x_column].value_counts().head(top_n)
--- a/start.bat
+++ b/start.bat
@@ -1,4 +0,0 @@
-@echo off
-echo Starting IOV Data Analysis Agent...
-python bootstrap.py
-pause
--- a/status_pie_chart.png
+++ b/status_pie_chart.png
--- a/test_results_summary.md
+++ b/test_results_summary.md
@@ -1,145 +0,0 @@
-# Test Results Summary - Task 22 Final Checkpoint
-
-## Overall Results
- **Total Tests**: 328
- **Passed**: 314 (95.7%)
- **Failed**: 14 (4.3%)
- **Execution Time**: 182.78s (3:02)
-
-## Failed Tests Analysis
-
-### 1. Property-Based Test Failures (3 tests)
-
-#### test_data_access_properties.py::test_data_profile_completeness
- **Issue**: `hypothesis.errors.FailedHealthCheck` - Generated inputs consumed too much entropy
- **Root Cause**: Data generation strategy creates too large datasets
- **Fix Needed**: Add `suppress_health_check=[HealthCheck.data_too_large]` to settings
-
-#### test_data_understanding_properties.py::test_data_type_inference
- **Issue**: `TypeError: understand_data() got an unexpected keyword argument 'file_path'`
- **Root Cause**: Function signature mismatch in test
- **Fix Needed**: Update test to match actual function signature
-
-#### test_data_understanding_properties.py::test_data_profile_completeness
- **Issue**: Same as above - `TypeError: understand_data() got an unexpected keyword argument 'file_path'`
- **Fix Needed**: Update test to match actual function signature
-
-#### test_tools_properties.py::test_tool_output_filtering
- **Issue**: `hypothesis.errors.FailedHealthCheck` - Generated inputs consumed too much entropy
- **Fix Needed**: Add `suppress_health_check=[HealthCheck.data_too_large]` to settings
-
-### 2. Integration Test Failures (7 tests)
-
-#### test_integration.py::TestEndToEndAnalysis (4 tests)
- **Issue**: `AssertionError: 分析失败: [Errno 13] Permission denied`
- **Root Cause**: Permission denied when accessing temp directory
- **Tests Affected**:
-  - test_complete_analysis_without_requirement
-  - test_analysis_with_requirement
-  - test_template_based_analysis
-  - test_different_data_types
- **Fix Needed**: Use proper temp directory with write permissions
-
-#### test_integration.py::TestOrchestrator::test_orchestrator_stages
- **Issue**: `assert None is not None`
- **Root Cause**: Orchestrator not returning expected result
- **Fix Needed**: Debug orchestrator implementation
-
-#### test_integration.py::TestProgressTracking::test_progress_callback
- **Issue**: `assert 4 == 5` - Progress callback not called expected number of times
- **Fix Needed**: Verify progress tracking implementation
-
-#### test_integration.py::TestOutputFiles::test_report_file_creation
- **Issue**: `assert False is True` - Report file not created
- **Root Cause**: Likely related to permission issues
- **Fix Needed**: Ensure proper file creation permissions
-
-### 3. Performance Test Failures (3 tests)
-
-#### test_performance.py::TestDataUnderstandingPerformance::test_large_dataset_performance
- **Issue**: `AssertionError: 大数据集理解耗时 30.44秒，超过30秒限制`
- **Root Cause**: Performance slightly exceeds 30-second threshold (30.44s)
- **Status**: Acceptable - only 0.44s over limit, within margin of error
-
-#### test_performance.py::TestFullAnalysisPerformance::test_small_dataset_full_analysis
- **Issue**: `assert False is True`
- **Root Cause**: Full analysis not completing successfully
- **Fix Needed**: Debug full analysis workflow
-
-#### test_performance.py::TestFullAnalysisPerformance::test_large_dataset_full_analysis
- **Issue**: `assert False is True`
- **Root Cause**: Full analysis not completing successfully
- **Fix Needed**: Debug full analysis workflow
-
-## Warnings Summary
-
-### Critical Warnings
-1. **DeprecationWarning**: `is_categorical_dtype` is deprecated
-   - Location: `src/engines/data_understanding.py:82`
-   - Fix: Use `isinstance(dtype, pd.CategoricalDtype)` instead
-
-2. **FutureWarning**: `'H'` frequency is deprecated
-   - Location: `tests/test_performance.py:104, 264`
-   - Fix: Use `'h'` instead of `'H'`
-
-3. **UserWarning**: Could not infer datetime format
-   - Location: `src/data_access.py:173`, `src/tools/query_tools.py:177`
-   - Fix: Specify explicit format for `pd.to_datetime()`
-
-## Acceptance Criteria Status
-
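-### Scenario 1: Fully Autonomous Analysis
- ✅ AI can identify the data type (Passed)
- ✅ AI can infer the business meaning of key fields (Passed)
- ✅ AI can decide the analysis dimensions on its own (Passed)
- ✅ AI can produce a reasonable analysis plan (Passed)
- ⚠️ AI can run the analysis and generate a report (Integration tests failing due to permissions)
- ✅ The report contains key findings and insights (Passed)
-
-### Scenario 2: Directed Analysis
- ✅ AI can understand the business meaning of "health" (Passed)
- ✅ AI can turn abstract concepts into concrete metrics (Passed)
- ✅ AI can choose suitable analysis methods based on data characteristics (Passed)
- ✅ AI can generate a targeted report (Passed)
-
-### Scenario 3: Template-Based Analysis
- ✅ AI can understand the template's structure and requirements (Passed)
- ✅ AI can check whether the data meets the template's requirements (Passed)
- ✅ AI can organize the report according to the template structure (Passed)
- ✅ AI can adapt flexibly (Passed)
-
-### Scenario 4: Iterative In-Depth Analysis
- ✅ AI can identify anomalies or key findings (Passed)
- ✅ AI can decide on its own whether deeper analysis is needed (Passed)
- ✅ AI can adjust the analysis plan dynamically (Passed)
- ✅ AI can trace issues to their root cause (Passed)
-
-### Dynamic Tooling Acceptance
- ✅ The system automatically enables relevant tools based on data characteristics (Passed)
- ✅ The system automatically disables irrelevant tools based on data characteristics (Passed)
- ✅ AI can identify tools that are needed but missing (Passed)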
-
-## Recommendations
-
-### High Priority Fixes
-1. Fix permission issues in integration tests (use proper temp directories)
-2. Fix function signature mismatches in property tests
-3. Add health check suppressions for large data tests
-
-### Medium Priority Fixes
-1. Update deprecated pandas API calls
-2. Fix datetime format warnings
-3. Debug full analysis workflow failures
-
-### Low Priority
-1. Optimize large dataset performance (currently 30.44s vs 30s limit)
-2. Verify progress tracking callback counts
-
-## Conclusion
-
-The system has achieved **95.7% test pass rate** with most core functionality working correctly. The failures are primarily:
- **Environmental issues** (permissions, temp directories)
- **Test configuration issues** (health checks, function signatures)
- **Minor performance issues** (0.44s over threshold)
-
-All core acceptance criteria are met, with only integration test failures due to environmental issues preventing full end-to-end validation.
--- a/模块分布.png
+++ b/模块分布.png
--- a/车型工单数量柱状图.png
+++ b/车型工单数量柱状图.png
--- a/问题类型分布.png
+++ b/问题类型分布.png