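Add Redis cache and Dify branch for Douyu streamer background profiles

- Add Redis read/write support and TTL configuration for room background profiles to the Douyu plugin
- Add a pipeline that generates the streamer background profile JSON with an LLM and writes it back to Redis
- Merge the automatic profile into room_context and warm the cache before daily report generation
- Extend the Dify workflow with a room_background_profile main branch and fallback branch
- Update the Douyu config example and workflow docs to describe background profile cache usage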

2026-04-27 13:04:13 +08:00
parent 889ce5acdd
commit 033fc1202d
4 changed files with 915 additions and 17 deletions
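
The cache-then-generate flow described in the commit message can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the plugin's code: `InMemoryRedis`, `get_or_build_profile`, and the `douyu:` key prefix are hypothetical stand-ins, and the real `_ensure_room_background_profile` also consults the manual profile before deciding whether to call the LLM.

```python
import json
import time
from typing import Any, Callable, Dict, Optional


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client that supports get/set with `ex`."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def set(self, key: str, value: str, ex: Optional[int] = None) -> bool:
        # Store the value together with an optional absolute expiry time.
        expires_at = (time.monotonic() + ex) if ex else None
        self._data[key] = (value, expires_at)
        return True

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # Expired entries behave like missing keys.
            return None
        return value


def get_or_build_profile(
    client: InMemoryRedis,
    room_id: str,
    build: Callable[[str], Dict[str, Any]],
    ttl_seconds: int = 7 * 24 * 3600,
    prefix: str = "douyu:",  # Assumed prefix; the plugin uses its own self.prefix.
) -> Dict[str, Any]:
    """Return the cached profile, or build one and cache it with a TTL."""
    key = f"{prefix}room_background_profile:{room_id}"
    cached = client.get(key)
    if cached:
        try:
            return json.loads(cached)
        except ValueError:
            pass  # Corrupt cache entry: fall through and rebuild.
    profile = build(room_id)
    if profile:
        # Mirror the commit's one-hour lower bound on the TTL.
        client.set(key, json.dumps(profile, ensure_ascii=False), ex=max(int(ttl_seconds), 3600))
    return profile or {}
```

With this shape, a second report for the same room within the TTL should reuse the cached JSON rather than call the builder again, which is the saving the commit aims for.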
--- a/docs/dify_douyu_daily_report_workflow.md
+++ b/docs/dify_douyu_daily_report_workflow.md
@@ -4,6 +4,7 @@
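 - Keep `plugins/douyu` receiving Douyu daily report tasks through a single Dify Workflow.
 - Inside the Workflow, however, branch to a dedicated LLM per `task_type` instead of letting one general-purpose LLM node handle all three styles.
 - Reduce style bleed and hallucination risk across the operations daily report, the danmu summary, and the fan fun report.
 - Allow the same Workflow to also generate a streamer/room background profile JSON that can be cached in Redis.
 ## 2. Currently recommended structure
 The current recommendation is: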
@@ -28,6 +29,7 @@
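  - `daily_report`: full operations daily report body
  - `danmu_summary`: danmu summary for the upper half of the operations report image
  - `fans_daily_report`: playful, parody-style daily report for fans
  - `room_background_profile`: streamer/room background profile JSON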
 - `query`
 - `system_prompt`
 - `user_prompt`
@@ -47,12 +49,13 @@
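 The latest recommended graph structure is:
 1. `Start`
-2. `if-else` node: split into three business lines by `task_type`
+2. `if-else` node: split into four business lines by `task_type`
 3. `运营日报 LLM`
 4. `弹幕总结 LLM`
 5. `粉丝日报 LLM`
-6. One `fail-branch` fallback LLM per business line
+6. `背景画像 LLM`
-7. Each success path and fallback path writes its own output to `End.text`
+7. One `fail-branch` fallback LLM per business line
 8. Each success path and fallback path writes its own output to `End.text`
 The exported file in the repository: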
 - [plugins/douyu/斗鱼日报AI.yml](d:/learn/abot/plugins/douyu/%E6%96%97%E9%B1%BC%E6%97%A5%E6%8A%A5AI.yml)
@@ -71,11 +74,12 @@
 - `fans_daily_report` only sees the fan fun-report rules
 ## 6. if-else branch rules
-The `if-else` node should contain at least three cases:
+The `if-else` node should contain at least four cases:
-1. `danmu_summary_case`
+1. `room_background_profile_case`
-2. `fans_daily_report_case`
+2. `danmu_summary_case`
-3. `daily_report_case`
+3. `fans_daily_report_case`
 4. `daily_report_case`
 The default `false` branch should fall back to `daily_report`, because the project-side default is `daily_report`.
@@ -95,6 +99,12 @@
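 - Focus on fun, a sense of being there, riffing on memes, and highlight moments
 - Do not write strategy, recommendations, conversion, or performance data
 ### 7.4 Background profile branch
 - Output structured JSON only
 - Prioritize the streamer's domain, career history, related people, storyline keywords, and meme explanations
 - If the Workflow already has search or a knowledge base attached, retrieve public information first and then organize it
 - If evidence is insufficient, prefer leaving fields empty and setting `confidence` to low
 ## 8. Design advice for fallback LLMs
 Do not attach all three main branches to the same general-purpose fallback model.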
@@ -102,6 +112,7 @@
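 - Operations daily report main branch fails -> operations daily report fallback LLM
 - Danmu summary main branch fails -> danmu summary fallback LLM
 - Fan daily report main branch fails -> fan daily report fallback LLM
 - Background profile main branch fails -> background profile fallback LLM
 This way the style does not drift during fallback either.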
@@ -109,10 +120,11 @@
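 The latest export in this repository already does the following:
 1. Adds an `if-else` node that performs real branching on `task_type`
-2. Splits the main LLM across three task types
+2. Splits the main LLM across four task types
-3. Splits the fallback LLM across three task types
+3. Splits the fallback LLM across four task types
 4. Each branch keeps its own narrowed prompt instead of sharing one overall prompt
-5. Output is still unified as `text`
+5. The background profile branch always outputs JSON, which the plugin can sanitize and then write to Redis
 6. Output is still unified as `text`
 ## 10. Does the project configuration layer need to change
 The scene layer generally does not need to change.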
@@ -138,11 +150,15 @@
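 3. Manually trigger `danmu_summary`
   Goal: confirm the summary is still short and reads like the live scene, without stretching into a long article
 4. Manually trigger `room_background_profile`
   Goal: confirm it returns strict JSON and conservatively leaves fields empty when there is no retrieval evidence
 ## 12. One-sentence conclusion
 Your current judgment is right.
 For a task like the Douyu daily report, with "one set of material, multiple output styles":
 - Keep the plugin side simple with a single scene
 - Keep the Dify side stable with `if-else + multiple LLM branches`
 - Cache an automatic background profile on the Redis side to further cut repeated requests and misreadings of in-community memes
 This is a more stable and more efficient approach than "one LLM handling all three task types".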
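
The branch rules in section 6 of the workflow doc can be mirrored in a few lines of Python, which may help when checking which branch a given `task_type` is expected to reach. This is only an illustrative sketch: the routing itself happens inside the Dify `if-else` node, and `select_branch` and `KNOWN_TASK_TYPES` are hypothetical names, not part of the plugin.

```python
# The four task types listed in the workflow doc, in the documented case order.
KNOWN_TASK_TYPES = (
    "room_background_profile",
    "danmu_summary",
    "fans_daily_report",
    "daily_report",
)


def select_branch(task_type: object) -> str:
    """Return the branch a task_type would reach, defaulting to daily_report."""
    normalized = str(task_type or "").strip()
    if normalized in KNOWN_TASK_TYPES:
        return normalized
    # Mirrors the documented default: the `false` branch falls back to daily_report.
    return "daily_report"
```

An unknown or empty `task_type` lands on `daily_report`, matching the project-side default the doc mentions.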
--- a/plugins/douyu/config.toml
+++ b/plugins/douyu/config.toml
@@ -26,6 +26,14 @@ daily_report_use_llm = true
 daily_report_max_sessions = 4
 daily_report_max_length = 1800
 daily_report_send_image = true
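 # Whether to enable "automatic streamer background profile":
 # 1. When the manual room_context_profiles is incomplete, allow an LLM call to compile a background profile;
 # 2. The result is cached in Redis and reused by the operations and fan daily reports;
 # 3. If the current Dify Workflow has search or a knowledge base attached, retrieval results are picked up here as well.
 auto_room_background_profile_enable = true
 # Cache lifetime of the automatic background profile in Redis; the default is 7 days.
 # Shorten it if streamer information changes often; lengthen it to reduce model usage.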
 auto_room_background_profile_ttl_seconds = 604800
 audience_stats_sample_interval_seconds = 0
 # Room semantic profile (optional):
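
One detail worth keeping in mind when tuning `auto_room_background_profile_ttl_seconds`: the config loader in `main.py` clamps the value to at least 3600 seconds, so very small values (including 0) do not reach Redis as written. The sketch below restates that clamp on its own; `resolve_profile_ttl` is a hypothetical helper name, and the constants are taken from the defaults shown in this commit.

```python
from typing import Any, Mapping

DEFAULT_TTL_SECONDS = 7 * 24 * 3600  # 604800, the default shown in config.toml
MIN_TTL_SECONDS = 3600               # lower bound applied when the config is loaded


def resolve_profile_ttl(cfg: Mapping[str, Any]) -> int:
    """Return the effective TTL: the configured value, raised to at least one hour."""
    raw = cfg.get("auto_room_background_profile_ttl_seconds", DEFAULT_TTL_SECONDS)
    return max(int(raw), MIN_TTL_SECONDS)
```

So although `set_room_background_profile` treats a TTL of 0 as a persistent key, that path appears reachable only by calling the Redis manager directly, not through this config option.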
--- a/plugins/douyu/main.py
+++ b/plugins/douyu/main.py
@@ -4,6 +4,7 @@ from collections import Counter
 from datetime import datetime, timedelta
 import os
 from pathlib import Path
 import re
 import threading
 import time
 from typing import Dict, Any, List, Optional, Tuple, Set
@@ -405,6 +406,54 @@ class DouyuRedisManager:
        key = f"{self.prefix}room_status:{room_id}"
        return self.redis.set(key, json.dumps(status, ensure_ascii=False))
    def get_room_background_profile(self, room_id: str) -> Optional[Dict[str, Any]]:
        """
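        Read the cached "automatic background profile" for a room.
        The key is kept separate from room_status, mainly because:
        1. the background profile changes far less often than the live status;
        2. the profile cache suits a longer TTL, unlike the real-time needs of online status;
        3. the profile can later be cleared or refreshed on its own without touching the live-status main path.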
        """
        key = f"{self.prefix}room_background_profile:{room_id}"
        data = self.redis.get(key)
        if not data:
            return None
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        try:
            return json.loads(data)
        except Exception:
            return None
    def set_room_background_profile(
        self,
        room_id: str,
        profile: Dict[str, Any],
        ttl_seconds: int = 0,
    ) -> bool:
        """
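        Write the room background profile cache.
        Notes:
        1. Redis stores the already-sanitized structured JSON, so downstream code does not re-parse raw LLM text each time;
        2. a TTL is allowed by default so entries expire on their own and stale career information does not linger;
        3. the TTL is optional: 0 writes a persistent key, which suits local debugging.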
        """
        key = f"{self.prefix}room_background_profile:{room_id}"
        payload = json.dumps(profile or {}, ensure_ascii=False)
        ttl_seconds = max(int(ttl_seconds or 0), 0)
        if ttl_seconds > 0:
            return bool(self.redis.set(key, payload, ex=ttl_seconds))
        return bool(self.redis.set(key, payload))
    def delete_room_background_profile(self, room_id: str) -> bool:
        """
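        Delete the room background profile cache.
        The main flow does not expose a manual command for this yet, but the low-level delete is kept
        so a later "force refresh profile" feature or back-office maintenance can use it.
        """
        key = f"{self.prefix}room_background_profile:{room_id}"
        # DEL returns the number of keys removed, so this is also True when the key did not exist.
        return self.redis.delete(key) >= 0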
    def get_room_session(self, room_id: str, session_id: str) -> Optional[Dict[str, Any]]:
        key = f"{self.prefix}room:{room_id}:session:{session_id}"
        data = self.redis.get(key)
@@ -464,9 +513,9 @@ class DouyuRedisManager:
 class DouyuPlugin(MessagePluginInterface):
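    # Report cache version:
    # 1. Bumping the version automatically invalidates old caches, so stale text/images are not reused;
-    # 2. This change bumps the version to 6 and adds a separate result type for the fan-oriented parody daily report, refreshing old caches as well,
+    # 2. This change bumps the version to 7: besides the fan daily report split, it adds the Redis automatic background profile,
-    #    so that old image layouts or old summary copy are not reused by mistake after release.
+    #    so old caches must be force-refreshed for the new prompt to pick up the latest room_context.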
-    _DAILY_REPORT_CACHE_VERSION = 6
+    _DAILY_REPORT_CACHE_VERSION = 7
    FEATURE_KEY = "DOUYU_MONITOR"
    FEATURE_DESCRIPTION = "🎮 斗鱼开播提醒 [订阅斗鱼 房间号, 取消订阅斗鱼 房间号]"
@@ -524,6 +573,12 @@ class DouyuPlugin(MessagePluginInterface):
        self._daily_report_max_sessions = 4
        self._daily_report_max_length = 1800
        self._daily_report_send_image = True
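        # Automatic background profile:
        # 1. when no manual profile exists, lets the LLM compile a background from room information;
        # 2. the result is cached in Redis so each daily report does not re-query the model;
        # 3. even if the model supports web access or retrieval, the result is only "auxiliary context" and never replaces real danmu evidence.
        self._auto_room_background_profile_enable = True
        self._auto_room_background_profile_ttl_seconds = 7 * 24 * 3600
        # Dify input strategy:
        # send a trimmed set of fields by default, since some Workflows validate complex object inputs strictly and return 400.
        # To use the full structured payload in the workflow, enable it explicitly in report_api.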
@@ -627,6 +682,217 @@ class DouyuPlugin(MessagePluginInterface):
        profile = self._room_context_profiles.get(str(room_id)) or {}
        return dict(profile) if isinstance(profile, dict) else {}
    def _merge_text_list_values(self, preferred: Any, fallback: Any, limit: int = 12) -> List[str]:
        """
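        Merge two text lists while keeping the higher-priority source first.
        Mainly serves the "manual profile + Redis automatic profile" merge:
        1. manually configured entries keep their original order;
        2. the automatic profile only fills in missing items and never overrides manual judgment;
        3. the final length is capped so background material cannot bloat the prompt without bound.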
        """
        merged: List[str] = []
        seen: Set[str] = set()
        for raw_values in (preferred, fallback):
            for item in self._normalize_text_list(raw_values):
                marker = item.casefold()
                if marker in seen:
                    continue
                seen.add(marker)
                merged.append(item)
                if len(merged) >= max(int(limit or 0), 1):
                    return merged
        return merged
    def _profile_has_meaningful_content(self, profile: Optional[Dict[str, Any]]) -> bool:
        """
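        Decide whether a background profile has real content.
        If any core field (career background, identity summary, domain, related people, storyline keywords,
        meme explanations, and so on) holds valid content, the profile is worth merging or reusing from cache.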
        """
        if not isinstance(profile, dict) or not profile:
            return False
        text_fields = [
            "domain",
            "identity_summary",
            "career_background",
            "evidence_summary",
        ]
        for field in text_fields:
            if str(profile.get(field) or "").strip():
                return True
        list_fields = [
            "domain_keywords",
            "related_people",
            "storyline_keywords",
            "meme_explanations",
            "style_hints",
        ]
        for field in list_fields:
            if self._normalize_text_list(profile.get(field)):
                return True
        return False
    def _profile_needs_auto_enrichment(
        self,
        manual_profile: Optional[Dict[str, Any]],
        cached_profile: Optional[Dict[str, Any]],
        *,
        force_refresh: bool = False,
    ) -> bool:
        """
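        Decide whether this room warrants an automatic profile generation.
        The policy is deliberately conservative:
        1. when the manual profile is already fairly complete, do not spend extra model calls;
        2. when Redis already holds a usable cache, reuse it first;
        3. trigger auto-enrichment only when the manual profile is clearly missing or too thin.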
        """
        if force_refresh:
            return True
        if self._profile_has_meaningful_content(cached_profile):
            return False
        if not self._profile_has_meaningful_content(manual_profile):
            return True
        manual_profile = manual_profile or {}
        filled_core_fields = 0
        for field in ("domain", "identity_summary", "career_background"):
            if str(manual_profile.get(field) or "").strip():
                filled_core_fields += 1
        list_item_count = 0
        for field in ("related_people", "storyline_keywords", "meme_explanations", "style_hints"):
            list_item_count += len(self._normalize_text_list(manual_profile.get(field)))
        return filled_core_fields < 3 or list_item_count < 4
    def _normalize_auto_room_background_profile(self, profile: Dict[str, Any]) -> Dict[str, Any]:
        """
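        Sanitize the background profile JSON returned by the LLM.
        The goal is not to keep as many fields as possible, but to ensure that what enters Redis:
        1. has a stable structure;
        2. has bounded text length;
        3. carries an explicit confidence and human-review flag, so prompts can down-weight it later.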
        """
        profile = profile if isinstance(profile, dict) else {}
        confidence = str(profile.get("confidence") or "").strip().lower()
        if confidence not in {"low", "medium", "high"}:
            confidence = "low"
        normalized = {
            "domain": str(profile.get("domain") or "").strip()[:32],
            "domain_keywords": self._normalize_text_list(profile.get("domain_keywords"))[:12],
            "identity_summary": str(profile.get("identity_summary") or "").strip()[:160],
            "career_background": str(profile.get("career_background") or "").strip()[:220],
            "related_people": self._normalize_text_list(profile.get("related_people"))[:12],
            "storyline_keywords": self._normalize_text_list(profile.get("storyline_keywords"))[:12],
            "meme_explanations": self._normalize_text_list(profile.get("meme_explanations"))[:8],
            "style_hints": self._normalize_text_list(profile.get("style_hints"))[:8],
            "confidence": confidence,
            "evidence_summary": str(profile.get("evidence_summary") or "").strip()[:180],
            "needs_human_review": bool(profile.get("needs_human_review", confidence != "high")),
        }
        if not self._profile_has_meaningful_content(normalized):
            return {}
        return normalized
    @staticmethod
    def _extract_json_object_from_text(text: str) -> Optional[Dict[str, Any]]:
        """
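        Extract a JSON object from LLM text.
        Handles two common kinds of messy output:
        1. the model wraps the JSON in a ```json code block;
        2. the model adds a little explanatory text before or after it.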
        """
        raw = str(text or "").strip()
        if not raw:
            return None
        if raw.startswith("```"):
            raw = re.sub(r"^```(?:json)?", "", raw, flags=re.IGNORECASE).strip()
            if raw.endswith("```"):
                raw = raw[:-3].strip()
        try:
            obj = json.loads(raw)
            return obj if isinstance(obj, dict) else None
        except Exception:
            pass
        start = raw.find("{")
        end = raw.rfind("}")
        if start < 0 or end <= start:
            return None
        candidate = raw[start:end + 1].strip()
        try:
            obj = json.loads(candidate)
            return obj if isinstance(obj, dict) else None
        except Exception:
            return None
    def _merge_room_background_profiles(
        self,
        manual_profile: Dict[str, Any],
        auto_profile: Dict[str, Any],
    ) -> Dict[str, Any]:
        """
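        Merge the manual profile with the automatic profile.
        The priority is fixed:
        1. manual configuration;
        2. Redis automatic profile;
        3. missing fields stay empty.
        This ensures human-confirmed information always outranks model inference.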
        """
        manual_profile = manual_profile if isinstance(manual_profile, dict) else {}
        auto_profile = auto_profile if isinstance(auto_profile, dict) else {}
        has_manual = self._profile_has_meaningful_content(manual_profile)
        has_auto = self._profile_has_meaningful_content(auto_profile)
        if has_manual and has_auto:
            profile_source = "manual+redis_auto"
        elif has_manual:
            profile_source = "manual_config"
        elif has_auto:
            profile_source = "redis_auto"
        else:
            profile_source = ""
        return {
            "domain": str(manual_profile.get("domain") or auto_profile.get("domain") or "").strip(),
            "domain_keywords": self._merge_text_list_values(
                manual_profile.get("domain_keywords"),
                auto_profile.get("domain_keywords"),
            ),
            "identity_summary": str(
                manual_profile.get("identity_summary")
                or auto_profile.get("identity_summary")
                or ""
            ).strip(),
            "career_background": str(
                manual_profile.get("career_background")
                or auto_profile.get("career_background")
                or ""
            ).strip(),
            "related_people": self._merge_text_list_values(
                manual_profile.get("related_people"),
                auto_profile.get("related_people"),
            ),
            "storyline_keywords": self._merge_text_list_values(
                manual_profile.get("storyline_keywords"),
                auto_profile.get("storyline_keywords"),
            ),
            "meme_explanations": self._merge_text_list_values(
                manual_profile.get("meme_explanations"),
                auto_profile.get("meme_explanations"),
                limit=8,
            ),
            "style_hints": self._merge_text_list_values(
                manual_profile.get("style_hints"),
                auto_profile.get("style_hints"),
                limit=8,
            ),
            "profile_source": profile_source,
            "profile_confidence": str(auto_profile.get("confidence") or "").strip().lower(),
            "profile_evidence_summary": str(auto_profile.get("evidence_summary") or "").strip(),
            "profile_needs_human_review": bool(auto_profile.get("needs_human_review", False)),
        }
    def _build_room_semantic_context(
        self,
        room_id: str,
@@ -669,7 +935,11 @@ class DouyuPlugin(MessagePluginInterface):
            ),
        }
-        profile = self._match_room_context_profile(room_id)
+        manual_profile = self._match_room_context_profile(room_id)
        auto_profile = {}
        if self.redis_manager:
            auto_profile = self.redis_manager.get_room_background_profile(room_id) or {}
        profile = self._merge_room_background_profiles(manual_profile, auto_profile)
        category_text = " ".join([
            merged_runtime_context.get("primary_category", ""),
            merged_runtime_context.get("secondary_category", ""),
@@ -701,6 +971,10 @@ class DouyuPlugin(MessagePluginInterface):
            "storyline_keywords": self._normalize_text_list(profile.get("storyline_keywords")),
            "meme_explanations": self._normalize_text_list(profile.get("meme_explanations")),
            "style_hints": self._normalize_text_list(profile.get("style_hints")),
            "profile_source": str(profile.get("profile_source") or "").strip(),
            "profile_confidence": str(profile.get("profile_confidence") or "").strip(),
            "profile_evidence_summary": str(profile.get("profile_evidence_summary") or "").strip(),
            "profile_needs_human_review": bool(profile.get("profile_needs_human_review", False)),
        }
    def _build_room_context_prompt_block(self, payload: Dict[str, Any]) -> str:
@@ -726,10 +1000,24 @@ class DouyuPlugin(MessagePluginInterface):
            )
        if runtime_context.get("tags"):
            parts.append(f"- 房间标签：{'、'.join(self._normalize_text_list(runtime_context.get('tags'))[:8])}。")
        profile_source = str(room_context.get("profile_source") or "").strip()
        if profile_source == "redis_auto":
            parts.append("- 背景资料来源：以下主播背景为系统自动整理后缓存到 Redis，仅作辅助理解；若和当天真实弹幕冲突，以当天弹幕为准。")
        elif profile_source == "manual+redis_auto":
            parts.append("- 背景资料来源：以下信息以手工配置为主，并由 Redis 自动画像补充缺失细节；自动部分只作辅助线索。")
        if room_context.get("identity_summary"):
            parts.append(f"- 主播身份提示：{room_context.get('identity_summary')}。")
        if room_context.get("career_background"):
            parts.append(f"- 职业生涯背景：{room_context.get('career_background')}。")
        if profile_source in {"redis_auto", "manual+redis_auto"}:
            confidence_map = {"high": "高", "medium": "中", "low": "低"}
            confidence_text = confidence_map.get(str(room_context.get("profile_confidence") or "").strip().lower(), "")
            if confidence_text:
                parts.append(f"- 自动背景置信度：{confidence_text}。若出现重名主播、跨圈梗或年份细节，请优先保守解读。")
            if room_context.get("profile_evidence_summary"):
                parts.append(f"- 自动背景备注：{room_context.get('profile_evidence_summary')}。")
            if bool(room_context.get("profile_needs_human_review")):
                parts.append("- 自动背景复核提示：该画像仍建议人工复核，避免把模糊人物关系当成确定事实。")
        related_people = self._normalize_text_list(room_context.get("related_people"))
        if related_people:
            parts.append(f"- 重点相关人物：{'、'.join(related_people[:12])}。弹幕提到这些人时，优先考虑圈内关联。")
@@ -899,6 +1187,18 @@ class DouyuPlugin(MessagePluginInterface):
            self._daily_report_max_sessions = int(cfg.get("daily_report_max_sessions", self._daily_report_max_sessions))
            self._daily_report_max_length = int(cfg.get("daily_report_max_length", self._daily_report_max_length))
            self._daily_report_send_image = bool(cfg.get("daily_report_send_image", self._daily_report_send_image))
            self._auto_room_background_profile_enable = bool(
                cfg.get("auto_room_background_profile_enable", self._auto_room_background_profile_enable)
            )
            self._auto_room_background_profile_ttl_seconds = max(
                int(
                    cfg.get(
                        "auto_room_background_profile_ttl_seconds",
                        self._auto_room_background_profile_ttl_seconds,
                    )
                ),
                3600,
            )
            self._audience_stats_sample_interval_seconds = int(
                cfg.get("audience_stats_sample_interval_seconds", self._audience_stats_sample_interval_seconds)
            )
@@ -2427,6 +2727,292 @@ class DouyuPlugin(MessagePluginInterface):
            inputs["report_payload_json"] = json.dumps(payload, ensure_ascii=False)
        return inputs
    def _build_room_background_profile_seed(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """
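        Extract a trimmed set of material from the daily report payload, suited to the background profile model.
        This has two benefits:
        1. the whole large payload need not be sent to the model again, which cuts tokens and noise;
        2. even without web access, the model can infer conservatively from room tags, representative danmu, and frequent terms.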
        """
        meta = payload.get("report_meta", {}) or {}
        room_context = payload.get("room_context", {}) or {}
        runtime_context = room_context.get("runtime_context", {}) or {}
        room_id = str(meta.get("room_id") or "").strip()
        representative_messages = []
        for item in (payload.get("representative_messages", []) or [])[:6]:
            content = str(item.get("content") or "").strip()
            if not content:
                continue
            representative_messages.append({
                "nickname": str(item.get("nickname") or "").strip(),
                "content": content[:90],
            })
        merged_templates = []
        for item in (payload.get("merged_templates", []) or [])[:8]:
            text = str(item.get("text") or "").strip()
            if not text:
                continue
            merged_templates.append({
                "text": text[:48],
                "count": int(item.get("count", 0) or 0),
            })
        repeated_messages = []
        for item in (payload.get("repeated_messages", []) or [])[:6]:
            text = str(item.get("text") or item.get("content") or "").strip()
            if not text:
                continue
            repeated_messages.append({
                "text": text[:48],
                "count": int(item.get("count", 0) or 0),
            })
        manual_profile = self._match_room_context_profile(room_id)
        return {
            "room_meta": {
                "room_id": room_id,
                "nickname": str(meta.get("nickname") or "").strip(),
                "room_name": str(meta.get("room_name") or "").strip(),
                "anchor_day": str(meta.get("anchor_day") or "").strip(),
            },
            "runtime_context": {
                "primary_category": str(runtime_context.get("primary_category") or "").strip(),
                "secondary_category": str(runtime_context.get("secondary_category") or "").strip(),
                "game_name": str(runtime_context.get("game_name") or "").strip(),
                "tags": self._normalize_text_list(runtime_context.get("tags"))[:10],
            },
            "inferred_domains": self._normalize_text_list(room_context.get("inferred_domains"))[:6],
            "top_terms": [
                str(item.get("term") or "").strip()
                for item in (payload.get("top_terms", []) or [])[:12]
                if str(item.get("term") or "").strip()
            ],
            "merged_templates": merged_templates,
            "repeated_messages": repeated_messages,
            "representative_messages": representative_messages,
            # Pass a snapshot of the manual profile too, so the model only fills gaps and does not overturn manual settings.
            "manual_profile_hint": {
                "domain": str(manual_profile.get("domain") or "").strip(),
                "identity_summary": str(manual_profile.get("identity_summary") or "").strip(),
                "career_background": str(manual_profile.get("career_background") or "").strip(),
                "related_people": self._normalize_text_list(manual_profile.get("related_people"))[:10],
                "storyline_keywords": self._normalize_text_list(manual_profile.get("storyline_keywords"))[:10],
            },
        }
    def _build_room_background_profile_prompt(self, payload: Dict[str, Any]) -> Tuple[str, str, Dict[str, Any]]:
        """
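        Build the "streamer background profile" prompt.
        Design principles:
        1. retrieve public information first; if the current model cannot retrieve, degrade to conservative inference;
        2. require strict JSON output so the result is easy to store in Redis;
        3. when unsure, leave it empty: better to write less than to invent career history or community relationships.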
        """
        seed = self._build_room_background_profile_seed(payload)
        system_prompt = (
            "你是斗鱼直播间背景画像整理助手。"
            "请根据给定房间信息，整理一份给日报模型使用的主播背景 JSON。"
            "如果你具备联网、搜索、知识库或检索能力，请优先检索公开资料再整理；"
            "如果你不具备检索能力，只能根据输入材料做保守判断，不确定的字段必须留空。"
            "输出必须是 JSON 对象，不要输出代码块，不要补充额外解释。"
        )
        user_prompt = (
            "请只输出一个 JSON 对象，字段固定为：\n"
            "{\n"
            "  \"domain\": \"\",\n"
            "  \"domain_keywords\": [],\n"
            "  \"identity_summary\": \"\",\n"
            "  \"career_background\": \"\",\n"
            "  \"related_people\": [],\n"
            "  \"storyline_keywords\": [],\n"
            "  \"meme_explanations\": [],\n"
            "  \"style_hints\": [],\n"
            "  \"confidence\": \"low|medium|high\",\n"
            "  \"evidence_summary\": \"\",\n"
            "  \"needs_human_review\": true\n"
            "}\n\n"
            "规则：\n"
            "1. identity_summary 要像“这是什么类型主播、观众通常围绕什么背景接梗”的一句话。\n"
            "2. career_background 只写公开且较稳定的职业经历、圈层身份、转型轨迹；不确定就留空。\n"
            "3. related_people 只保留和该主播强相关的人物；不确定不要硬猜。\n"
            "4. meme_explanations 和 style_hints 要服务日报理解，不要写百科长文。\n"
            "5. 如果主播不是 Dota2 主播，也要按其真实领域整理，不要强行往 Dota2 上靠。\n"
            "6. 如果资料存在歧义、重名或证据不足，confidence 设为 low，并把 needs_human_review 设为 true。\n\n"
            f"输入材料：\n{json.dumps(seed, ensure_ascii=False, indent=2)}"
        )
        return system_prompt, user_prompt, seed
    def _build_dify_room_background_inputs(
        self,
        *,
        system_prompt: str,
        user_prompt: str,
        seed: Dict[str, Any],
    ) -> Dict[str, Any]:
        """
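        Assemble the Dify Workflow inputs for the "room background profile" task.
        The existing scene is reused, but a separate task_type routes to the new Workflow branch,
        so retrieval or knowledge-base nodes can be attached on the Dify side later while the plugin-side interface stays unchanged.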
        """
        room_meta = seed.get("room_meta", {}) or {}
        return {
            "task_type": "room_background_profile",
            "query": user_prompt,
            "system_prompt": system_prompt,
            "user_prompt": user_prompt,
            "room_id": str(room_meta.get("room_id") or "").strip(),
            "anchor_day": str(room_meta.get("anchor_day") or "").strip(),
            "nickname": str(room_meta.get("nickname") or room_meta.get("room_name") or "").strip(),
            "max_length": "1200",
            "report_payload_json": json.dumps(seed, ensure_ascii=False),
        }
    def _call_room_background_profile_llm(
        self,
        *,
        system_prompt: str,
        user_prompt: str,
        seed: Dict[str, Any],
    ) -> str:
        """
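        Call the shared LLM client to generate the background profile text.
        Uses the same provider compatibility strategy as the daily report body pipeline:
        1. the Dify provider goes through the workflow/chat run(inputs);
        2. other providers go through plain chat(system, user).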
        """
        if not self._daily_report_llm_client:
            return ""
        room_meta = seed.get("room_meta", {}) or {}
        room_id = str(room_meta.get("room_id") or "").strip()
        user_id = f"douyu_room_background_{room_id or 'unknown'}"
        if self._daily_report_llm_client.provider == "dify":
            result = self._daily_report_llm_client.run(
                prompt=user_prompt,
                user=user_id,
                inputs=self._build_dify_room_background_inputs(
                    system_prompt=system_prompt,
                    user_prompt=user_prompt,
                    seed=seed,
                ),
                tag=f"douyu_room_background_{room_id or 'unknown'}",
            )
            return str((result or {}).get("text", "") or "").strip()
        return self._daily_report_llm_client.chat(
            system_prompt,
            user_prompt,
            user_id=user_id,
        ).strip()
    def _generate_room_background_profile(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """
        Synchronously generate a background profile that can be cached in Redis.
        This method is run via asyncio.to_thread so it does not block the main event loop.
        """
        if not self._daily_report_llm_client:
            return {}
        system_prompt, user_prompt, seed = self._build_room_background_profile_prompt(payload)
        response_text = self._call_room_background_profile_llm(
            system_prompt=system_prompt,
            user_prompt=user_prompt,
            seed=seed,
        )
        if not response_text:
            logger.warning(
                f"斗鱼房间背景画像生成失败: room={((seed.get('room_meta', {}) or {}).get('room_id', ''))}, "
                f"last_error={self._daily_report_llm_client.last_error}"
            )
            return {}
        parsed = self._extract_json_object_from_text(response_text)
        if not parsed:
            logger.warning(
                f"斗鱼房间背景画像返回非 JSON，已忽略: room={((seed.get('room_meta', {}) or {}).get('room_id', ''))}, "
                f"preview={response_text[:180]}"
            )
            return {}
        normalized = self._normalize_auto_room_background_profile(parsed)
        if not normalized:
            return {}
        normalized["generated_at"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        normalized["source_mode"] = "redis_auto"
        normalized["generator"] = (
            f"{self._daily_report_llm_client.provider}:{self._daily_report_llm_client.model or self._daily_report_llm_client.endpoint}"
        )
        return normalized
    async def _ensure_room_background_profile(
        self,
        room_id: str,
        nickname: str,
        room_name: str,
        sessions: List[Dict[str, Any]],
        payload: Dict[str, Any],
        *,
        force_refresh: bool = False,
    ) -> Dict[str, Any]:
        """
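        Ensure the room background profile is ready before generating a daily report.
        Flow:
        1. first check whether the manual configuration and the Redis cache already suffice;
        2. trigger one automatic LLM profile only when necessary;
        3. whether or not generation succeeds, rebuild room_context at the end so the payload uses the latest cache.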
        """
        if not payload:
            return payload
        meta = payload.get("report_meta", {}) or {}
        room_id = str(room_id or meta.get("room_id") or "").strip()
        nickname = str(nickname or meta.get("nickname") or "").strip()
        room_name = str(room_name or meta.get("room_name") or "").strip()
        if not room_id:
            return payload
        manual_profile = self._match_room_context_profile(room_id)
        cached_profile = (
            self.redis_manager.get_room_background_profile(room_id) if self.redis_manager else {}
        ) or {}
        should_build = (
            self._auto_room_background_profile_enable
            and self._daily_report_use_llm
            and self._daily_report_llm_client is not None
            and self.redis_manager is not None
            and self._profile_needs_auto_enrichment(
                manual_profile,
                cached_profile,
                force_refresh=force_refresh,
            )
        )
        if should_build:
            generated_profile = await asyncio.to_thread(
                self._generate_room_background_profile,
                payload,
            )
            if generated_profile:
                ttl_seconds = max(int(self._auto_room_background_profile_ttl_seconds or 0), 3600)
                self.redis_manager.set_room_background_profile(
                    room_id,
                    generated_profile,
                    ttl_seconds=ttl_seconds,
                )
                logger.info(
                    f"斗鱼房间背景画像已刷新并缓存到 Redis: room={room_id}, "
                    f"ttl={ttl_seconds}s, confidence={generated_profile.get('confidence', '')}"
                )
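        # Rebuild room_context here whether or not an automatic profile was triggered:
        # 1. if Redis was just written, the new profile shows up in the payload immediately;
        # 2. if there is no new profile, the latest "manual profile + Redis cache + live room info" merge still applies uniformly.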
        payload["room_context"] = self._build_room_semantic_context(room_id, nickname, room_name, sessions)
        return payload
    def _call_daily_report_llm(
        self,
        *,
@@ -2763,6 +3349,18 @@ class DouyuPlugin(MessagePluginInterface):
                    f"sessions={len(sessions)}, min_messages={self._daily_report_min_messages}"
                )
                continue
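            # Warm the background profile once before actually generating the daily report:
            # 1. on the first hit for a room, try to fill in the streamer background;
            # 2. the result goes into Redis, so later daily reports for the same room reuse it directly;
            # 3. the payload is refreshed here with the latest room_context.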
            payload = await self._ensure_room_background_profile(
                room_id,
                "",
                "",
                sessions,
                payload,
                force_refresh=force_regenerate,
            )
            report_result = await self._get_or_create_daily_report_result(
                room_id,
                anchor_day,
@@ -2835,6 +3433,15 @@ class DouyuPlugin(MessagePluginInterface):
                    f"sessions={len(sessions)}, min_messages={self._daily_report_min_messages}"
                )
                continue
            # The fan daily report needs the same background profile to better understand career memes, community figures, and classic moments.
            payload = await self._ensure_room_background_profile(
                room_id,
                "",
                "",
                sessions,
                payload,
                force_refresh=force_regenerate,
            )
            report_result = await self._get_or_create_fans_daily_report_result(
                room_id,
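The cache-and-refresh flow in the hunks above can be sketched in isolation. This is a minimal illustration, not the plugin's implementation: `InMemoryProfileCache`, `effective_ttl`, and `ensure_profile` are hypothetical names, the in-memory store stands in for `redis_manager`, and the "check the cache first unless `force_refresh` is set" step is an assumption, since that part of `_ensure_room_background_profile` is not shown in the diff. Only the one-hour TTL floor (`max(int(... or 0), 3600)`) is taken directly from the source.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

MIN_TTL_SECONDS = 3600  # same floor as max(..., 3600) in the diff


class InMemoryProfileCache:
    """Hypothetical stand-in for redis_manager; not the plugin's real class."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def get(self, room_id: str) -> Optional[Dict[str, Any]]:
        entry = self._store.get(room_id)
        if entry is None:
            return None
        expires_at, profile = entry
        if self._clock() >= expires_at:
            # Expired entries behave like a cache miss.
            del self._store[room_id]
            return None
        return profile

    def set(self, room_id: str, profile: Dict[str, Any], ttl_seconds: int) -> None:
        self._store[room_id] = (self._clock() + ttl_seconds, profile)


def effective_ttl(configured: Any) -> int:
    """Clamp a possibly missing or too-small TTL setting to at least one hour."""
    return max(int(configured or 0), MIN_TTL_SECONDS)


def ensure_profile(
    cache: InMemoryProfileCache,
    room_id: str,
    generate: Callable[[str], Optional[Dict[str, Any]]],
    configured_ttl: Any,
    force_refresh: bool = False,
) -> Optional[Dict[str, Any]]:
    """Return a cached profile, or generate one and cache it (assumed flow)."""
    if not force_refresh:
        cached = cache.get(room_id)
        if cached is not None:
            return cached
    profile = generate(room_id)
    if profile:
        # Only non-empty results are written back, as in the diff.
        cache.set(room_id, profile, ttl_seconds=effective_ttl(configured_ttl))
    return profile
```

With this shape, a misconfigured TTL of `0` or `None` still yields a one-hour cache entry, so a bad setting cannot trigger an LLM call on every report run.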
--- a/plugins/douyu/斗鱼日报AI.yml
+++ b/plugins/douyu/斗鱼日报AI.yml
@@ -1,5 +1,5 @@
 app:
-  description: 斗鱼直播日报、弹幕总结与粉丝乐子日报工作流
+  description: 斗鱼直播日报、弹幕总结、粉丝乐子日报与房间背景画像工作流
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: workflow
@@ -99,6 +99,18 @@ workflow:
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
        sourceType: if-else
        targetType: llm
      id: 200000010-room-background-profile-case-200000104-target
      source: '200000010'
      sourceHandle: room_background_profile_case
      target: '200000104'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
@@ -122,6 +134,41 @@ workflow:
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
        sourceType: llm
        targetType: end
      id: 200000104-source-200000307-target
      source: '200000104'
      sourceHandle: source
      target: '200000307'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInLoop: false
        sourceType: llm
        targetType: llm
      id: 200000104-fail-branch-200000204-target
      source: '200000104'
      sourceHandle: fail-branch
      target: '200000204'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
        sourceType: llm
        targetType: end
      id: 200000204-source-200000308-target
      source: '200000204'
      sourceHandle: source
      target: '200000308'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
@@ -245,9 +292,10 @@ workflow:
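        # task_type is the business routing switch for the whole workflow:
        # 1. daily_report: full operations daily report body;
        # 2. danmu_summary: danmu (bullet-chat) summary for the top half of the operations image;
-        # 3. fans_daily_report: fun, tongue-in-cheek daily report for fans.
+        # 3. fans_daily_report: fun, tongue-in-cheek daily report for fans;
        # 4. room_background_profile: streamer/room background profile JSON.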
        - default: daily_report
-          hint: daily_report / danmu_summary / fans_daily_report
+          hint: daily_report / danmu_summary / fans_daily_report / room_background_profile
          label: task_type
          max_length: 255
          options: []
@@ -333,6 +381,17 @@ workflow:
      width: 242
    - data:
        cases:
        - case_id: room_background_profile_case
          conditions:
          - comparison_operator: contains
            id: room_background_profile_case_cond
            value: room_background_profile
            varType: string
            variable_selector:
            - '200000001'
            - task_type
          id: room_background_profile_case
          logical_operator: and
        - case_id: danmu_summary_case
          conditions:
          - comparison_operator: contains
@@ -671,6 +730,98 @@ workflow:
      targetPosition: left
      type: custom
      width: 242
    - data:
        context:
          enabled: false
          variable_selector: []
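        # Background profile branch:
        # 1. Serves room_background_profile only;
        # 2. Prioritizes organizing public information and the input material into structured JSON;
        # 3. When unsure, leaves fields empty rather than inventing career history or relationships.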
        error_strategy: fail-branch
        model:
          completion_params:
            temperature: 0.1
          mode: chat
          name: grok-4
          provider: langgenius/openai_api_compatible/openai_api_compatible
        prompt_template:
        - id: background_system_1
          role: system
          text: '你是「斗鱼直播间背景画像助手」。
            你的唯一任务是输出主播/房间背景画像 JSON。
            输出原则：
            1. 只输出 JSON 对象，不要使用代码块，不要输出解释文字。
            2. 如果当前工作流已接入联网、检索或知识库能力，请优先检索公开资料后再整理。
            3. 如果没有检索能力，只能根据输入材料做保守推断，不确定字段必须留空。
            4. 不要把其他同名主播、选手或解说的经历串到当前房间。
            5. 如果主播不是 Dota2 主播，也要按其真实领域整理，不要强行往 Dota2 上靠。
            6. confidence 只能是 low / medium / high。
            7. 如果 system_prompt 非空，优先遵循其中的补充规则。
            '
        - id: background_user_1
          role: user
          text: '【任务类型】
            room_background_profile
            【system_prompt】
            {{#200000001.system_prompt#}}
            【user_prompt】
            {{#200000001.user_prompt#}}
            【meta】
            room_id={{#200000001.room_id#}}, anchor_day={{#200000001.anchor_day#}},
            nickname={{#200000001.nickname#}}
            【report_payload_json】
            {{#200000001.report_payload_json#}}
            请只输出背景画像 JSON。
            '
        retry_config:
          max_retries: 2
          retry_enabled: true
          retry_interval: 1000
        selected: false
        title: 背景画像 LLM
        type: llm
        vision:
          enabled: false
      height: 172
      id: '200000104'
      position:
        x: 664
        y: 650
      positionAbsolute:
        x: 664
        y: 650
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        context:
          enabled: false
@@ -821,6 +972,76 @@ workflow:
      targetPosition: left
      type: custom
      width: 242
    - data:
        context:
          enabled: false
          variable_selector: []
        # Background profile fallback model:
        # On failure, keep the conservative policy of "JSON only; leave fields empty rather than fabricate".
        model:
          completion_params:
            temperature: 0.05
          mode: chat
          name: gpt-5.4
          provider: langgenius/openai_api_compatible/openai_api_compatible
        prompt_template:
        - id: background_system_2
          role: system
          text: '你是「斗鱼直播间背景画像助手」。
            当前是回退链路，请稳定输出背景画像 JSON。
            只输出 JSON 对象，不要使用代码块，不要输出额外说明。
            如果证据不足或重名风险较高，字段留空，confidence 设为 low，needs_human_review 设为 true。
            '
        - id: background_user_2
          role: user
          text: '【任务类型】
            room_background_profile
            【system_prompt】
            {{#200000001.system_prompt#}}
            【user_prompt】
            {{#200000001.user_prompt#}}
            【report_payload_json】
            {{#200000001.report_payload_json#}}
            请只输出背景画像 JSON。
            '
        retry_config:
          max_retries: 2
          retry_enabled: true
          retry_interval: 1000
        selected: false
        title: 背景画像回退 LLM
        type: llm
        vision:
          enabled: false
      height: 118
      id: '200000204'
      position:
        x: 1010
        y: 650
      positionAbsolute:
        x: 1010
        y: 650
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        context:
          enabled: false
@@ -1034,6 +1255,52 @@ workflow:
      targetPosition: left
      type: custom
      width: 242
    - data:
        outputs:
        - value_selector:
          - '200000104'
          - text
          value_type: string
          variable: text
        selected: false
        title: 背景画像输出
        type: end
      height: 88
      id: '200000307'
      position:
        x: 1010
        y: 760
      positionAbsolute:
        x: 1010
        y: 760
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    - data:
        outputs:
        - value_selector:
          - '200000204'
          - text
          value_type: string
          variable: text
        selected: false
        title: 背景画像回退输出
        type: end
      height: 88
      id: '200000308'
      position:
        x: 1354
        y: 650
      positionAbsolute:
        x: 1354
        y: 650
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 242
    viewport:
      x: 74
      y: 74
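Both background-profile prompts in the workflow above ask the model for a bare JSON object whose `confidence` is one of `low` / `medium` / `high`, with `needs_human_review` set when evidence is weak. A caller may still want to verify that before writing anything to Redis. The following is a hedged sketch of such a check: `parse_background_profile` is a hypothetical helper, not a function from the plugin, and the fence-stripping step is an assumption about how models sometimes ignore the "no code block" instruction.

```python
import json
from typing import Any, Dict, Optional

ALLOWED_CONFIDENCE = {"low", "medium", "high"}
FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this block readable


def parse_background_profile(raw_text: str) -> Optional[Dict[str, Any]]:
    """Parse the End.text output into a dict, or return None if unusable."""
    text = (raw_text or "").strip()
    # Assumption: tolerate a fenced reply even though the prompt forbids it.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:].strip()
    try:
        profile = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(profile, dict):
        return None
    confidence = str(profile.get("confidence", "")).strip().lower()
    if confidence in ALLOWED_CONFIDENCE:
        profile["confidence"] = confidence
    else:
        # Unknown or missing confidence is downgraded and flagged for review.
        profile["confidence"] = "low"
        profile["needs_human_review"] = True
    return profile
```

Returning `None` for unparseable output lets the caller skip the Redis write, which matches the `if generated_profile:` guard in the plugin code.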