feat: restructure member profiling into a daily/weekly/monthly layered digest pipeline and enhance admin summary features

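This commit is a substantial upgrade of the member profile plugin. Its core goal is to turn the previous one-off, recency-biased member interaction summary into a layered profile that accumulates over time.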

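Main functional changes:
1. Add the layered member digest table t_member_digest and a matching database operator layer, which persists periodic digest results by member, by group, and by digest type (daily/weekly/monthly).
2. Add MemberDigestService to the member_context plugin. Profile generation is split into three stages (daily, weekly, and monthly digests), and the final profile service consumes these layered digests, reducing the cost and distortion of repeatedly processing large volumes of raw messages.
3. Add a prompt builder module with separate prompts for daily observation, weekly consolidation, monthly consolidation, and final profile assembly. The prompts stress neutral, restrained wording, avoid sensitive inferences, and keep long-term traits clearly separate from recent state.
4. Rewrite final member profile generation. It first fuses daily/weekly/monthly digests into long-term traits, habit patterns, long-term reply preferences, and recent state, then has the AI do a final pass over the layered digests, so conclusions no longer rest solely on the last 30 days of messages.
5. Keep and strengthen the long-term profile fusion logic: scoring, decay, and accumulation of repeated evidence let long-term traits stabilize gradually instead of being overwritten by a single refresh.
6. Add query methods to the message storage layer: incremental fetch of a member's messages since a given time, per-date activity statistics, and per-day message fetch, providing the data for layered digest generation.
7. Extend the member_context plugin configuration with the daily digest message cap, the minimum message count for a daily digest, the cap on daily digests backfilled per run, and the number of daily/weekly/monthly digests used for the final profile, so accuracy can be traded against system load.
8. The admin member summary detail page now shows long-term communication tendency, long-term traits, habit patterns, long-term reply preferences, recent state, historical sample count, and layered digest counts, making it easy to see how far the profile has matured.
9. Change the admin "view member summary" endpoint: when no summary exists on first open, it no longer generates one synchronously. It returns a not-ready status and relies on the manual asynchronous refresh in the admin UI, reducing page stalls and blocked requests.
10. Enhance refresh logging: single-member and group-level refreshes log the current refresh mode and the daily/weekly/monthly digest counts, which helps track profile build progress.
11. Adjust recomputation of the current day's, week's, and month's digests so that after new daily digests are written, the current week and month digests do not stay on a stale version.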

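After this commit, member profiling moves from a "single-layer summary based on recent samples" to a "layered profile pipeline based on accumulation over time". This lays the groundwork for feeding the profile reliably into the AI auto-reply context, while keeping the existing group permission control and admin asynchronous refresh as unchanged as possible.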
liuwei
2026-04-02 12:42:28 +08:00
parent 7f21ef4f69
commit 60b72874b5
8 changed files with 1183 additions and 241 deletions

@@ -260,9 +260,13 @@ def api_group_member_context(roomid, wxid):
         if not server.member_context_service.is_group_enabled(roomid):
             return jsonify({"success": False, "error": "该群未启用成员交互摘要功能"}), 403
         context = server.member_context_db.get_member_context(roomid, wxid)
-        if not context:
-            context = server.member_context_service.refresh_member_context(roomid, wxid)
-        return jsonify({"success": True, "data": {"context": context}})
+        return jsonify({
+            "success": True,
+            "data": {
+                "context": context,
+                "ready": bool(context)
+            }
+        })
     except Exception as e:
         logger.error(f"获取群成员交互摘要失败: {e}")
         return jsonify({"success": False, "error": str(e)}), 500
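Because the endpoint no longer builds a summary on first view, a caller has to tell "no summary yet" apart from a failure. The following is a minimal sketch of how a client could interpret the response body. The helper name `describe_member_context` and the status strings it returns are illustrative assumptions, not part of this commit.

```python
from typing import Any, Dict, Optional, Tuple


def describe_member_context(body: Dict[str, Any]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Map the endpoint's JSON body to a (status, context) pair.

    Hypothetical helper: only the `success`, `data.context` and `data.ready`
    keys come from the diff above; everything else is an assumption.
    """
    if not body.get("success"):
        return "error", None
    data = body.get("data") or {}
    context = data.get("context")
    # Fall back to the context's truthiness for responses that lack `ready`.
    ready = data.get("ready", bool(context))
    if not ready:
        return "pending", None
    return "ready", context
```

A "pending" result is the cue to trigger the manual refresh and check again later, rather than to treat the request as failed.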

@@ -262,19 +262,58 @@
<el-button size="mini" type="success" @click="refreshMemberContext">刷新摘要</el-button>
</div>
<el-empty
v-if="!memberContextLoading && !memberContext"
description="暂无后台摘要,可手动刷新后稍等查看。">
</el-empty>
<el-descriptions :column="1" border v-if="memberContext">
<el-descriptions-item label="互动强度">{% raw %}{{ memberContext.activity_level || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="表达特征">{% raw %}{{ memberContext.message_pattern || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="互动风格">{% raw %}{{ memberContext.interaction_style || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="性格倾向(长期观察)">{% raw %}{{ ((memberContext.meta || {}).temperament_tendency) || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="回复建议">{% raw %}{{ memberContext.response_style_hint || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="观察跨度">
{% raw %}{{ ((memberContext.meta || {}).observation_days || 0) }}{% endraw %} 天
<el-tag
v-if="(memberContext.meta || {}).stable_ready"
size="mini"
type="success"
class="context-tag">
已进入长期画像
</el-tag>
<el-tag
v-else
size="mini"
type="info"
class="context-tag">
仍在积累
</el-tag>
</el-descriptions-item>
<el-descriptions-item label="长期关注">
<el-tag v-for="item in (memberContext.topics_of_interest || [])" :key="item" size="mini" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!(memberContext.topics_of_interest || []).length">-</span>
</el-descriptions-item>
<el-descriptions-item label="长期特征">
<el-tag v-for="item in (((memberContext.meta || {}).stable_traits) || [])" :key="item" size="mini" type="warning" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!((((memberContext.meta || {}).stable_traits) || []).length)">-</span>
</el-descriptions-item>
<el-descriptions-item label="习惯模式">
<el-tag v-for="item in (((memberContext.meta || {}).habit_patterns) || [])" :key="item" size="mini" type="info" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!((((memberContext.meta || {}).habit_patterns) || []).length)">-</span>
</el-descriptions-item>
<el-descriptions-item label="长期回复偏好">
<el-tag v-for="item in (((memberContext.meta || {}).long_term_reply_preferences) || [])" :key="item" size="mini" type="success" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!((((memberContext.meta || {}).long_term_reply_preferences) || []).length)">-</span>
</el-descriptions-item>
<el-descriptions-item label="近期话题">
<el-tag v-for="item in (memberContext.recent_focus || [])" :key="item" size="mini" type="success" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!(memberContext.recent_focus || []).length">-</span>
</el-descriptions-item>
<el-descriptions-item label="近期状态">
<el-tag v-for="item in (((memberContext.meta || {}).recent_state) || [])" :key="item" size="mini" type="primary" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!((((memberContext.meta || {}).recent_state) || []).length)">-</span>
</el-descriptions-item>
<el-descriptions-item label="显著特征">
<el-tag v-for="item in ((memberContext.meta || {}).engagement_traits || [])" :key="item" size="mini" type="warning" class="context-tag">{% raw %}{{ item }}{% endraw %}</el-tag>
<span v-if="!(((memberContext.meta || {}).engagement_traits || []).length)">-</span>
@@ -285,6 +324,13 @@
</el-descriptions-item>
<el-descriptions-item label="摘要说明">{% raw %}{{ memberContext.summary_text || '-' }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="样本消息">{% raw %}{{ memberContext.source_message_count || 0 }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="历史样本">{% raw %}{{ ((memberContext.meta || {}).history_message_count) || 0 }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="摘要层级">
日 {% raw %}{{ ((memberContext.meta || {}).digest_daily_count) || 0 }}{% endraw %}
/ 周 {% raw %}{{ ((memberContext.meta || {}).digest_weekly_count) || 0 }}{% endraw %}
/ 月 {% raw %}{{ ((memberContext.meta || {}).digest_monthly_count) || 0 }}{% endraw %}
</el-descriptions-item>
<el-descriptions-item label="累计刷新">{% raw %}{{ ((memberContext.meta || {}).profile_iterations) || 0 }}{% endraw %}</el-descriptions-item>
<el-descriptions-item label="最后更新">{% raw %}{{ memberContext.last_profiled_at || '-' }}{% endraw %}</el-descriptions-item>
</el-descriptions>
</div>

157
db/member_digest_db.py Normal file
@@ -0,0 +1,157 @@
# -*- coding: utf-8 -*-
import json
from datetime import datetime
from typing import Dict, List, Optional

from db.base import BaseDBOperator
from db.connection import DBConnectionManager


class MemberDigestDBOperator(BaseDBOperator):
    """成员分层摘要数据库操作"""

    def __init__(self, db_manager: DBConnectionManager):
        super().__init__(db_manager)
        self._create_tables()

    def _create_tables(self):
        try:
            self.execute_update("""
                CREATE TABLE IF NOT EXISTS t_member_digest (
                    id INT AUTO_INCREMENT PRIMARY KEY,
                    chatroom_id VARCHAR(64) NOT NULL COMMENT '群聊ID',
                    wxid VARCHAR(64) NOT NULL COMMENT '成员微信ID',
                    digest_type VARCHAR(16) NOT NULL COMMENT '摘要类型 daily|weekly|monthly',
                    period_key VARCHAR(32) NOT NULL COMMENT '周期主键',
                    period_start DATETIME NULL COMMENT '周期开始时间',
                    period_end DATETIME NULL COMMENT '周期结束时间',
                    display_name VARCHAR(128) COMMENT '成员展示名',
                    source_count INT DEFAULT 0 COMMENT '源数据条数',
                    summary_text TEXT COMMENT '摘要说明',
                    structured_json LONGTEXT COMMENT '结构化摘要JSON',
                    meta_json LONGTEXT COMMENT '附加元数据JSON',
                    last_generated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '最后生成时间',
                    create_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
                    update_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
                    UNIQUE KEY idx_member_digest (chatroom_id, wxid, digest_type, period_key),
                    KEY idx_digest_lookup (chatroom_id, wxid, digest_type, period_end)
                ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='成员分层摘要表';
            """)
        except Exception as e:
            self.LOG.error(f"创建成员分层摘要表失败: {e}")

    def save_digest(self, digest: Dict) -> bool:
        try:
            data = {
                "chatroom_id": digest.get("chatroom_id", ""),
                "wxid": digest.get("wxid", ""),
                "digest_type": digest.get("digest_type", ""),
                "period_key": digest.get("period_key", ""),
                "period_start": digest.get("period_start"),
                "period_end": digest.get("period_end"),
                "display_name": digest.get("display_name", ""),
                "source_count": digest.get("source_count", 0),
                "summary_text": digest.get("summary_text", ""),
                "structured_json": json.dumps(digest.get("structured", {}), ensure_ascii=False),
                "meta_json": json.dumps(digest.get("meta", {}), ensure_ascii=False),
                "last_generated_at": digest.get("last_generated_at", datetime.now().strftime("%Y-%m-%d %H:%M:%S")),
            }
            fields = ", ".join(data.keys())
            placeholders = ", ".join(["%s"] * len(data))
            update_clause = ", ".join(
                [f"{key}=VALUES({key})" for key in data.keys() if key not in ("chatroom_id", "wxid", "digest_type", "period_key")]
            )
            sql = f"""
                INSERT INTO t_member_digest ({fields})
                VALUES ({placeholders})
                ON DUPLICATE KEY UPDATE {update_clause}
            """
            return self.execute_update(sql, tuple(data.values()))
        except Exception as e:
            self.LOG.error(f"保存成员分层摘要失败: {e}")
            return False

    def get_digest(self, chatroom_id: str, wxid: str, digest_type: str, period_key: str) -> Optional[Dict]:
        try:
            sql = """
                SELECT *
                FROM t_member_digest
                WHERE chatroom_id = %s AND wxid = %s AND digest_type = %s AND period_key = %s
                LIMIT 1
            """
            row = self.execute_query(sql, (chatroom_id, wxid, digest_type, period_key), fetch_one=True)
            return self._deserialize_row(row)
        except Exception as e:
            self.LOG.error(f"获取成员分层摘要失败: {e}")
            return None

    def list_digests(self, chatroom_id: str, wxid: str, digest_type: str, limit: int = 20) -> List[Dict]:
        try:
            sql = """
                SELECT *
                FROM t_member_digest
                WHERE chatroom_id = %s AND wxid = %s AND digest_type = %s
                ORDER BY period_end DESC, period_key DESC
                LIMIT %s
            """
            rows = self.execute_query(sql, (chatroom_id, wxid, digest_type, limit)) or []
            return [self._deserialize_row(row) for row in rows]
        except Exception as e:
            self.LOG.error(f"获取成员分层摘要列表失败: {e}")
            return []

    def list_digest_keys(self, chatroom_id: str, wxid: str, digest_type: str) -> List[str]:
        try:
            sql = """
                SELECT period_key
                FROM t_member_digest
                WHERE chatroom_id = %s AND wxid = %s AND digest_type = %s
            """
            rows = self.execute_query(sql, (chatroom_id, wxid, digest_type)) or []
            return [str(row.get("period_key")) for row in rows if row.get("period_key")]
        except Exception as e:
            self.LOG.error(f"获取成员摘要key失败: {e}")
            return []

    def list_period_digests(self, chatroom_id: str, wxid: str, digest_type: str,
                            period_keys: List[str]) -> List[Dict]:
        try:
            if not period_keys:
                return []
            placeholders = ", ".join(["%s"] * len(period_keys))
            sql = f"""
                SELECT *
                FROM t_member_digest
                WHERE chatroom_id = %s AND wxid = %s AND digest_type = %s AND period_key IN ({placeholders})
                ORDER BY period_end ASC, period_key ASC
            """
            params = (chatroom_id, wxid, digest_type, *period_keys)
            rows = self.execute_query(sql, params) or []
            return [self._deserialize_row(row) for row in rows]
        except Exception as e:
            self.LOG.error(f"批量获取成员分层摘要失败: {e}")
            return []

    @staticmethod
    def _deserialize_row(row: Optional[Dict]) -> Optional[Dict]:
        if not row:
            return row
        for key in ("structured_json", "meta_json"):
            value = row.get(key)
            if not value:
                row[key] = {}
                continue
            try:
                row[key] = json.loads(value)
            except Exception:
                row[key] = {}
        for key in ("period_start", "period_end", "last_generated_at"):
            value = row.get(key)
            if isinstance(value, datetime):
                row[key] = value.strftime("%Y-%m-%d %H:%M:%S")
        row["structured"] = row.get("structured_json", {})
        row["meta"] = row.get("meta_json", {})
        return row
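`save_digest` relies on the unique key `(chatroom_id, wxid, digest_type, period_key)` so that regenerating a digest for the same period overwrites the stored row instead of adding a second one. The sketch below isolates that statement-building step so the generated SQL can be inspected without a database. `build_upsert` and `KEY_COLUMNS` are hypothetical names introduced for illustration; the operator in the diff builds the string inline.

```python
from typing import Dict, Sequence, Tuple

# Columns of the unique key; they identify the row and are never updated.
KEY_COLUMNS = ("chatroom_id", "wxid", "digest_type", "period_key")


def build_upsert(table: str, data: Dict[str, object],
                 key_columns: Sequence[str] = KEY_COLUMNS) -> Tuple[str, tuple]:
    """Build a MySQL upsert statement and its parameter tuple (illustrative helper)."""
    fields = ", ".join(data.keys())
    placeholders = ", ".join(["%s"] * len(data))
    # Every non-key column takes the value from the attempted insert.
    update_clause = ", ".join(
        f"{column}=VALUES({column})" for column in data if column not in key_columns
    )
    sql = (
        f"INSERT INTO {table} ({fields}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {update_clause}"
    )
    return sql, tuple(data.values())


sql, params = build_upsert("t_member_digest", {
    "chatroom_id": "room-1",
    "wxid": "wx-1",
    "digest_type": "daily",
    "period_key": "2026-04-01",
    "summary_text": "example",
})
print(sql)
```

One design note: MySQL documents the `VALUES()` function inside `ON DUPLICATE KEY UPDATE` as deprecated from 8.0.20 in favor of row aliases, so this form may need revisiting on newer servers.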

@@ -60,6 +60,68 @@ class MessageStorageDB(BaseDBOperator):
        results = self.execute_query(sql, (days, group_id, wxid, limit)) or []
        return list(reversed(results))

    def get_member_messages_since(self, group_id: str, wxid: str, since_time, limit: int = 200) -> List[Dict]:
        """获取指定时间之后的成员消息"""
        sql = """
            SELECT timestamp, sender, content, message_type
            FROM messages
            WHERE timestamp > %s
              AND group_id = %s
              AND sender = %s
              AND message_type IN (1, 49)
              AND CHAR_LENGTH(content) BETWEEN 2 AND 500
              AND content NOT LIKE '/%%'
            ORDER BY timestamp ASC
            LIMIT %s
        """
        if isinstance(since_time, datetime):
            since_time = since_time.strftime("%Y-%m-%d %H:%M:%S")
        return self.execute_query(sql, (since_time, group_id, wxid, limit)) or []

    def get_member_active_dates(self, group_id: str, wxid: str, days: int = 365) -> List[Dict]:
        """获取成员在指定时间窗口内的活跃日期列表"""
        sql = """
            SELECT
                DATE(timestamp) AS message_date,
                COUNT(*) AS msg_count,
                MIN(timestamp) AS first_message_time,
                MAX(timestamp) AS last_message_time
            FROM messages
            WHERE timestamp >= DATE_SUB(NOW(), INTERVAL %s DAY)
              AND group_id = %s
              AND sender = %s
              AND message_type IN (1, 49)
              AND CHAR_LENGTH(content) BETWEEN 2 AND 500
              AND content NOT LIKE '/%%'
            GROUP BY DATE(timestamp)
            ORDER BY message_date ASC
        """
        rows = self.execute_query(sql, (days, group_id, wxid)) or []
        for row in rows:
            for key in ("message_date", "first_message_time", "last_message_time"):
                value = row.get(key)
                if isinstance(value, datetime):
                    row[key] = value.strftime("%Y-%m-%d %H:%M:%S") if key != "message_date" else value.strftime("%Y-%m-%d")
                elif value:
                    row[key] = str(value)
        return rows

    def get_member_messages_on_date(self, group_id: str, wxid: str, target_date: str, limit: int = 120) -> List[Dict]:
        """获取成员在某一天的消息"""
        sql = """
            SELECT timestamp, sender, content, message_type
            FROM messages
            WHERE DATE(timestamp) = %s
              AND group_id = %s
              AND sender = %s
              AND message_type IN (1, 49)
              AND CHAR_LENGTH(content) BETWEEN 2 AND 500
              AND content NOT LIKE '/%%'
            ORDER BY timestamp ASC
            LIMIT %s
        """
        return self.execute_query(sql, (target_date, group_id, wxid, limit)) or []

    def get_message_count_by_date(self, date: str) -> List[Dict]:
        """获取指定日期的消息统计"""
        sql = """
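`get_member_messages_on_date` filters with `DATE(timestamp) = %s`. Wrapping the column in a function generally keeps MySQL from using an ordinary index on `timestamp` for that predicate, so a half-open range (`timestamp >= %s AND timestamp < %s`) is a common alternative. The sketch below only computes the two bounds. `day_bounds` is a hypothetical helper, and whether the `messages` table has an index that would benefit is an assumption, since its schema is not shown here.

```python
from datetime import datetime, timedelta
from typing import Tuple


def day_bounds(target_date: str) -> Tuple[str, str]:
    """Return (start, next_day_start) strings for a half-open day range."""
    day_start = datetime.strptime(target_date, "%Y-%m-%d")
    next_day_start = day_start + timedelta(days=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return day_start.strftime(fmt), next_day_start.strftime(fmt)


start, end = day_bounds("2026-03-31")
print(start, end)  # 2026-03-31 00:00:00 2026-04-01 00:00:00
```

The exclusive upper bound avoids missing messages in the final second of the day when the column stores fractional seconds.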

@@ -12,6 +12,30 @@ request_timeout = 60
sample_days = 30
sample_message_limit = 80
refresh_limit_per_member = 200
long_term_days = 365
long_term_message_limit = 600
bootstrap_days = 365
bootstrap_message_limit = 600
incremental_message_limit = 80
incremental_recent_days = 7
recalibration_days = 30
daily_message_limit = 120
daily_digest_min_messages = 6
max_daily_digests_per_run = 45
weekly_digest_limit = 16
monthly_digest_limit = 12
final_daily_limit = 8
final_weekly_limit = 6
final_monthly_limit = 6
ai_min_member_messages = 12
active_member_hours = 72
min_member_messages = 3
max_members_per_group_per_run = 30
stale_hours = 24
stable_decay = 0.96
stable_max_items = 6
stable_min_score = 0.9
stable_ready_days = 180
[schedule]
refresh_times = ["04:20"]
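The `stable_decay`, `stable_min_score` and `stable_max_items` settings parameterize the long-term fusion described in the commit message (scoring, decay, and accumulation of repeated evidence). The fusion code itself is not part of the hunks shown here, so the following is only a sketch of one way such parameters could interact. The function name `fuse_stable_scores`, the additive score of 1.0 per observation, and the pruning rule are all assumptions.

```python
from typing import Dict, Iterable


def fuse_stable_scores(previous: Dict[str, float], observed: Iterable[str],
                       decay: float = 0.96, min_score: float = 0.9,
                       max_items: int = 6) -> Dict[str, float]:
    """One refresh step of a decay-and-accumulate score table (illustrative only)."""
    # Older evidence fades a little on every refresh.
    scores = {item: score * decay for item, score in previous.items()}
    # Each trait observed in this refresh adds one unit of evidence.
    for item in observed:
        scores[item] = scores.get(item, 0.0) + 1.0
    # Keep only sufficiently supported traits, strongest first.
    ranked = sorted(
        ((item, score) for item, score in scores.items() if score >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return dict(ranked[:max_items])
```

Under these assumptions a trait seen once survives two further refreshes (scores 0.96 and about 0.92) and is pruned on the third (about 0.88), while a trait that keeps recurring accumulates score and stays in the table.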

@@ -0,0 +1,373 @@
# -*- coding: utf-8 -*-
import json
import re
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

import requests
from loguru import logger

from db.contacts_db import ContactsDBOperator
from db.member_digest_db import MemberDigestDBOperator
from db.message_storage import MessageStorageDB
from plugins.member_context.prompt_builder import MemberContextPromptBuilder


class MemberDigestService:
    """成员分层摘要服务"""

    def __init__(self, contacts_db: ContactsDBOperator, message_db: MessageStorageDB,
                 digest_db: MemberDigestDBOperator, plugin_config: Optional[Dict] = None):
        self.contacts_db = contacts_db
        self.message_db = message_db
        self.digest_db = digest_db
        self.LOG = logger
        self.plugin_config = plugin_config or {}
        api_config = self.plugin_config.get("api", {})
        profile_config = self.plugin_config.get("profile", {})
        self.ai_enabled = bool(api_config.get("enable", api_config.get("enabled", False)))
        self.ai_base_url = (api_config.get("base_url") or "").rstrip("/")
        self.ai_api_key = api_config.get("api_key", "")
        self.ai_endpoint = str(api_config.get("endpoint", "completion-messages")).lstrip("/")
        self.ai_timeout = int(api_config.get("request_timeout", 60))
        self.bootstrap_days = int(profile_config.get("bootstrap_days", 365))
        self.daily_message_limit = int(profile_config.get("daily_message_limit", 120))
        self.daily_digest_min_messages = int(profile_config.get("daily_digest_min_messages", 6))
        self.max_daily_digests_per_run = int(profile_config.get("max_daily_digests_per_run", 45))
        self.weekly_digest_limit = int(profile_config.get("weekly_digest_limit", 16))
        self.monthly_digest_limit = int(profile_config.get("monthly_digest_limit", 12))
        self.final_daily_limit = int(profile_config.get("final_daily_limit", 8))
        self.final_weekly_limit = int(profile_config.get("final_weekly_limit", 6))
        self.final_monthly_limit = int(profile_config.get("final_monthly_limit", 6))

    def ensure_member_digest_pipeline(self, chatroom_id: str, wxid: str, force: bool = False) -> Dict:
        member = self.contacts_db.get_chatroom_member_info(chatroom_id, wxid) or {}
        display_name = member.get("display_name") or member.get("nick_name") or wxid
        active_dates = self.message_db.get_member_active_dates(chatroom_id, wxid, days=self.bootstrap_days)
        if not active_dates:
            return {
                "display_name": display_name,
                "daily_digests": [],
                "weekly_digests": [],
                "monthly_digests": [],
                "stats": {"daily": 0, "weekly": 0, "monthly": 0, "active_days": 0},
            }
        built_daily = self._ensure_daily_digests(chatroom_id, wxid, display_name, active_dates, force=force)
        built_weekly = self._ensure_weekly_digests(chatroom_id, wxid, display_name, force=force)
        built_monthly = self._ensure_monthly_digests(chatroom_id, wxid, display_name, force=force)
        daily_digests = self.digest_db.list_digests(chatroom_id, wxid, "daily", limit=self.final_daily_limit)
        weekly_digests = self.digest_db.list_digests(chatroom_id, wxid, "weekly", limit=self.final_weekly_limit)
        monthly_digests = self.digest_db.list_digests(chatroom_id, wxid, "monthly", limit=self.final_monthly_limit)
        return {
            "display_name": display_name,
            "daily_digests": daily_digests,
            "weekly_digests": weekly_digests,
            "monthly_digests": monthly_digests,
            "stats": {
                "daily": len(daily_digests),
                "weekly": len(weekly_digests),
                "monthly": len(monthly_digests),
                "active_days": len(active_dates),
                "built_daily": built_daily,
                "built_weekly": built_weekly,
                "built_monthly": built_monthly,
            },
        }

    def _ensure_daily_digests(self, chatroom_id: str, wxid: str, display_name: str,
                              active_dates: List[Dict], force: bool = False) -> int:
        existing_keys = set(self.digest_db.list_digest_keys(chatroom_id, wxid, "daily"))
        built = 0
        processed = 0
        sorted_dates = sorted(active_dates, key=lambda item: str(item.get("message_date")))
        current_day = datetime.now().strftime("%Y-%m-%d")
        for item in sorted_dates:
            period_key = str(item.get("message_date"))
            msg_count = int(item.get("msg_count", 0))
            if msg_count < self.daily_digest_min_messages:
                continue
            if not force and period_key in existing_keys and period_key != current_day:
                continue
            messages = self.message_db.get_member_messages_on_date(
                chatroom_id, wxid, period_key, limit=self.daily_message_limit
            )
            if len(messages) < self.daily_digest_min_messages:
                continue
            digest = self._build_daily_digest(chatroom_id, wxid, display_name, period_key, messages)
            if digest:
                self.digest_db.save_digest(digest)
                built += 1
                processed += 1
                self.LOG.info(
                    f"[成员交互摘要][日摘要] 完成: group={chatroom_id}, wxid={wxid}, "
                    f"date={period_key}, messages={len(messages)}"
                )
            if not force and processed >= self.max_daily_digests_per_run:
                break
        return built

    def _ensure_weekly_digests(self, chatroom_id: str, wxid: str, display_name: str, force: bool = False) -> int:
        daily_digests = self.digest_db.list_digests(chatroom_id, wxid, "daily", limit=400)
        grouped = defaultdict(list)
        for item in daily_digests:
            week_key, _, _ = self._week_period_bounds(item.get("period_key"))
            grouped[week_key].append(item)
        existing_keys = set(self.digest_db.list_digest_keys(chatroom_id, wxid, "weekly"))
        current_week_key, _, _ = self._week_period_bounds(datetime.now().strftime("%Y-%m-%d"))
        built = 0
        for week_key, items in sorted(grouped.items()):
            if len(items) < 2:
                continue
            if not force and week_key in existing_keys and week_key != current_week_key:
                continue
            period_key, period_start, period_end = self._week_period_bounds(items[0].get("period_key"))
            digest = self._build_period_digest(
                "weekly", chatroom_id, wxid, display_name, period_key, period_start, period_end, items
            )
            if digest:
                self.digest_db.save_digest(digest)
                built += 1
                self.LOG.info(
                    f"[成员交互摘要][周摘要] 完成: group={chatroom_id}, wxid={wxid}, "
                    f"week={period_key}, days={len(items)}"
                )
        return built

    def _ensure_monthly_digests(self, chatroom_id: str, wxid: str, display_name: str, force: bool = False) -> int:
        weekly_digests = self.digest_db.list_digests(chatroom_id, wxid, "weekly", limit=200)
        grouped = defaultdict(list)
        for item in weekly_digests:
            month_key, _, _ = self._month_period_bounds(item.get("period_end"))
            grouped[month_key].append(item)
        existing_keys = set(self.digest_db.list_digest_keys(chatroom_id, wxid, "monthly"))
        current_month_key, _, _ = self._month_period_bounds(datetime.now().strftime("%Y-%m-%d"))
        built = 0
        for month_key, items in sorted(grouped.items()):
            if len(items) < 2:
                continue
            if not force and month_key in existing_keys and month_key != current_month_key:
                continue
            period_key, period_start, period_end = self._month_period_bounds(items[-1].get("period_end"))
            digest = self._build_period_digest(
                "monthly", chatroom_id, wxid, display_name, period_key, period_start, period_end, items
            )
            if digest:
                self.digest_db.save_digest(digest)
                built += 1
                self.LOG.info(
                    f"[成员交互摘要][月摘要] 完成: group={chatroom_id}, wxid={wxid}, "
                    f"month={period_key}, weeks={len(items)}"
                )
        return built

    def _build_daily_digest(self, chatroom_id: str, wxid: str, display_name: str,
                            digest_date: str, messages: List[Dict]) -> Optional[Dict]:
        prompt = MemberContextPromptBuilder.build_daily_digest_prompt(
            chatroom_id, wxid, display_name, digest_date, messages
        )
        parsed = self._request_ai_json(prompt, tag=f"daily:{digest_date}", chatroom_id=chatroom_id, wxid=wxid)
        if not parsed:
            parsed = self._build_daily_digest_fallback(messages)
        if not parsed:
            return None
        period_start = f"{digest_date} 00:00:00"
        period_end = f"{digest_date} 23:59:59"
        return {
            "chatroom_id": chatroom_id,
            "wxid": wxid,
            "digest_type": "daily",
            "period_key": digest_date,
            "period_start": period_start,
            "period_end": period_end,
            "display_name": display_name,
            "source_count": len(messages),
            "summary_text": parsed.get("summary_text", ""),
            "structured": parsed,
            "meta": {
                "source_type": "messages",
                "representative_messages": parsed.get("representative_messages", []),
            },
        }

    def _build_period_digest(self, digest_type: str, chatroom_id: str, wxid: str, display_name: str,
                             period_key: str, period_start: str, period_end: str,
                             items: List[Dict]) -> Optional[Dict]:
        prompt = MemberContextPromptBuilder.build_period_digest_prompt(
            digest_type, chatroom_id, wxid, display_name, period_key, items
        )
        parsed = self._request_ai_json(prompt, tag=f"{digest_type}:{period_key}", chatroom_id=chatroom_id, wxid=wxid)
        if not parsed:
            parsed = self._build_period_digest_fallback(digest_type, items)
        if not parsed:
            return None
        return {
            "chatroom_id": chatroom_id,
            "wxid": wxid,
            "digest_type": digest_type,
            "period_key": period_key,
            "period_start": period_start,
            "period_end": period_end,
            "display_name": display_name,
            "source_count": len(items),
            "summary_text": parsed.get("summary_text", ""),
            "structured": parsed,
            "meta": {
                "source_keys": [item.get("period_key") for item in items],
            },
        }

    def _request_ai_json(self, prompt: str, tag: str, chatroom_id: str, wxid: str) -> Optional[Dict]:
        if not self.ai_enabled or not self.ai_base_url or not self.ai_api_key:
            return None
        headers = {
            "Authorization": f"Bearer {self.ai_api_key}",
            "Content-Type": "application/json",
        }
        payload = {
            "inputs": {"query": prompt},
            "response_mode": "blocking",
            "user": f"member-digest:{chatroom_id}:{wxid}:{tag}",
        }
        url = f"{self.ai_base_url}/{self.ai_endpoint}"
        try:
            self.LOG.info(f"[成员交互摘要][AI] 发起摘要请求: group={chatroom_id}, wxid={wxid}, tag={tag}")
            response = requests.post(url, headers=headers, json=payload, timeout=self.ai_timeout)
            response.raise_for_status()
            data = response.json()
            parsed = self._parse_ai_answer(data.get("answer", ""))
            if parsed:
                usage = (data.get("metadata") or {}).get("usage", {}) or {}
                parsed["ai_usage"] = usage
            return parsed
        except Exception as e:
            self.LOG.warning(f"[成员交互摘要][AI] 摘要请求失败: group={chatroom_id}, wxid={wxid}, tag={tag}, error={e}")
            return None

    def _parse_ai_answer(self, answer: str) -> Optional[Dict]:
        if not answer:
            return None
        text = answer.strip()
        match = re.search(r"\{.*\}", text, re.S)
        if match:
            text = match.group(0)
        try:
            data = json.loads(text)
        except Exception:
            return None
        normalized = {}
        for key, value in data.items():
            if isinstance(value, list):
                normalized[key] = [str(item).strip() for item in value if str(item).strip()]
            elif isinstance(value, (int, float)):
                normalized[key] = value
            else:
                normalized[key] = str(value).strip()
        return normalized

    def _build_daily_digest_fallback(self, messages: List[Dict]) -> Optional[Dict]:
        if not messages:
            return None
        contents = [str(item.get("content", "")).strip() for item in messages if item.get("content")]
        if not contents:
            return None
        short_samples = [content[:60] for content in contents[:3]]
        avg_len = sum(len(content) for content in contents) / max(len(contents), 1)
        message_pattern = "短句居多" if avg_len <= 16 else "表达较完整" if avg_len >= 35 else "表达中等长度"
        return {
            "topics": [],
            "interaction_style": "自然跟随式互动",
            "message_pattern": message_pattern,
            "response_style_hint": "保持简洁自然,先回应核心点",
            "habit_signals": [],
            "engagement_traits": [],
            "reply_taboos": [],
            "temperament_signal": "当天样本有限,暂以中性沟通观察为主",
            "summary_text": f"当日消息约{len(messages)}条,{message_pattern}",
            "representative_messages": short_samples,
            "confidence": 0.35,
        }

    def _build_period_digest_fallback(self, digest_type: str, items: List[Dict]) -> Optional[Dict]:
        if not items:
            return None
        topic_counts = defaultdict(int)
        trait_counts = defaultdict(int)
        habit_counts = defaultdict(int)
        reply_counts = defaultdict(int)
        temperament_values = []
        for item in items:
            structured = item.get("structured", {}) or {}
            for topic in structured.get("topics", []) + structured.get("stable_topics", []) + structured.get("long_term_topics", []):
                topic_counts[topic] += 1
            for trait in structured.get("engagement_traits", []) + structured.get("stable_traits", []):
                trait_counts[trait] += 1
            for habit in structured.get("habit_signals", []) + structured.get("habit_patterns", []):
                habit_counts[habit] += 1
            for pref in structured.get("reply_preferences", []) + structured.get("long_term_reply_preferences", []):
                reply_counts[pref] += 1
            if structured.get("temperament_signal"):
                temperament_values.append(structured.get("temperament_signal"))
            if structured.get("temperament_tendency"):
                temperament_values.append(structured.get("temperament_tendency"))
        top_topics = [key for key, _ in sorted(topic_counts.items(), key=lambda item: item[1], reverse=True)[:5]]
        top_traits = [key for key, _ in sorted(trait_counts.items(), key=lambda item: item[1], reverse=True)[:5]]
        top_habits = [key for key, _ in sorted(habit_counts.items(), key=lambda item: item[1], reverse=True)[:5]]
        top_reply = [key for key, _ in sorted(reply_counts.items(), key=lambda item: item[1], reverse=True)[:4]]
        temperament = temperament_values[0] if temperament_values else "整体保持中性沟通特征"
        if digest_type == "weekly":
            return {
                "stable_topics": top_topics,
                "stable_traits": top_traits,
                "habit_patterns": top_habits,
                "reply_preferences": top_reply,
                "recent_state": top_topics[:3],
                "temperament_tendency": temperament,
                "summary_text": "本周沟通特征已按重复信号汇总。",
                "confidence": 0.45,
            }
        return {
            "long_term_topics": top_topics,
            "stable_traits": top_traits,
            "habit_patterns": top_habits,
            "long_term_reply_preferences": top_reply,
            "phase_state": top_topics[:3],
            "temperament_tendency": temperament,
            "summary_text": "本月沟通特征已按周摘要汇总。",
            "confidence": 0.5,
        }

    @staticmethod
    def _week_period_bounds(date_value: str) -> Tuple[str, str, str]:
        target_date = datetime.strptime(str(date_value)[:10], "%Y-%m-%d")
        week_start = target_date - timedelta(days=target_date.weekday())
        week_end = week_start + timedelta(days=6)
        week_key = f"{week_start.strftime('%Y-%m-%d')}"
        return week_key, week_start.strftime("%Y-%m-%d 00:00:00"), week_end.strftime("%Y-%m-%d 23:59:59")

    @staticmethod
    def _month_period_bounds(date_value: str) -> Tuple[str, str, str]:
        target_dt = datetime.strptime(str(date_value)[:10], "%Y-%m-%d")
        month_start = target_dt.replace(day=1)
        if month_start.month == 12:
            next_month = month_start.replace(year=month_start.year + 1, month=1, day=1)
        else:
            next_month = month_start.replace(month=month_start.month + 1, day=1)
        month_end = next_month - timedelta(days=1)
        month_key = month_start.strftime("%Y-%m")
        return month_key, month_start.strftime("%Y-%m-%d 00:00:00"), month_end.strftime("%Y-%m-%d 23:59:59")
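The period helpers decide which bucket each digest rolls up into: weeks run Monday to Sunday and are keyed by the Monday's date, and `_ensure_monthly_digests` assigns each weekly digest to the month of its `period_end`. The worked example below re-implements just enough of that logic, under the hypothetical names `week_bounds` and `month_key`, to show what happens to a week that straddles a month boundary.

```python
from datetime import datetime, timedelta
from typing import Tuple


def week_bounds(date_value: str) -> Tuple[str, str, str]:
    """Return (week_key, week_start, week_end) for the Monday-based week of a date."""
    day = datetime.strptime(date_value[:10], "%Y-%m-%d")
    start = day - timedelta(days=day.weekday())  # weekday() is 0 on Monday
    end = start + timedelta(days=6)
    return (
        start.strftime("%Y-%m-%d"),
        start.strftime("%Y-%m-%d 00:00:00"),
        end.strftime("%Y-%m-%d 23:59:59"),
    )


def month_key(date_value: str) -> str:
    """Return the YYYY-MM key of the month a date falls in."""
    return datetime.strptime(date_value[:10], "%Y-%m-%d").strftime("%Y-%m")


week_key, _, week_end = week_bounds("2026-03-31")
print(week_key, "|", week_end, "|", month_key(week_end))
# 2026-03-30 | 2026-04-05 23:59:59 | 2026-04
```

So a daily digest dated 2026-03-31 (a Tuesday) lands in the week keyed 2026-03-30, and that week feeds the 2026-04 monthly digest rather than 2026-03, because grouping uses the week's end date.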

@@ -0,0 +1,147 @@
# -*- coding: utf-8 -*-
import json
from typing import Dict, List


class MemberContextPromptBuilder:
    """成员分层画像提示词构建器"""

    @staticmethod
    def build_daily_digest_prompt(chatroom_id: str, wxid: str, display_name: str,
                                  digest_date: str, messages: List[Dict]) -> str:
        lines = []
        for msg in messages[-80:]:
            ts = str(msg.get("timestamp", ""))[11:16]
            content = (msg.get("content") or "").replace("\n", " ").strip()[:180]
            if content:
                lines.append(f"[{ts}] {content}")
        return (
            "你是微信群后台的成员日观察摘要生成器。\n"
            "请仅基于给定的当日公开聊天记录,提取对后续互动有帮助的中性行为观察。\n"
            "不要做人格诊断、隐私猜测、负面评价,不要脑补群外信息。\n"
            "输出严格 JSON,不要 markdown。\n"
            "{"
            "\"topics\":[\"主题1\"],"
            "\"interaction_style\":\"一句中文\","
            "\"message_pattern\":\"一句中文\","
            "\"response_style_hint\":\"一句中文\","
            "\"habit_signals\":[\"信号1\"],"
            "\"engagement_traits\":[\"特征1\"],"
            "\"reply_taboos\":[\"避坑1\"],"
            "\"temperament_signal\":\"一句中文,描述当天显露的沟通倾向,必须克制\","
            "\"summary_text\":\"一段不超过100字的日摘要\","
            "\"representative_messages\":[\"原话1\",\"原话2\"],"
            "\"confidence\":0.0"
            "}\n"
            "要求:\n"
            "1. topics 最多4个,habit_signals 最多4个,engagement_traits 最多4个,reply_taboos 最多3个。\n"
            "2. temperament_signal 只能写当日可观察到的沟通倾向,不可上升为长期性格判断。\n"
            "3. representative_messages 保留最能代表当天风格的短句,最多3条。\n"
            f"成员: {display_name} ({wxid})\n"
            f"群ID: {chatroom_id}\n"
            f"日期: {digest_date}\n"
            "当日消息:\n" + ("\n".join(lines) or "暂无")
        )

    @staticmethod
    def build_period_digest_prompt(digest_type: str, chatroom_id: str, wxid: str,
                                   display_name: str, period_key: str, items: List[Dict]) -> str:
        structured_lines = []
        for item in items:
            structured = item.get("structured", {}) or {}
            payload = {
                "period_key": item.get("period_key"),
                "summary_text": item.get("summary_text", ""),
                "topics": structured.get("topics") or structured.get("stable_topics") or structured.get("long_term_topics") or [],
                "habit_signals": structured.get("habit_signals") or structured.get("habit_patterns") or [],
                "engagement_traits": structured.get("engagement_traits") or structured.get("stable_traits") or [],
                "reply_preferences": structured.get("reply_preferences") or structured.get("long_term_reply_preferences") or [],
                "temperament_signal": structured.get("temperament_signal") or structured.get("temperament_tendency") or "",
                "recent_state": structured.get("recent_state") or [],
            }
            structured_lines.append(json.dumps(payload, ensure_ascii=False))
        if digest_type == "weekly":
            schema = (
                "{"
                "\"stable_topics\":[\"主题1\"],"
                "\"stable_traits\":[\"特征1\"],"
                "\"habit_patterns\":[\"习惯1\"],"
                "\"reply_preferences\":[\"偏好1\"],"
                "\"recent_state\":[\"状态1\"],"
                "\"temperament_tendency\":\"一句中文\","
                "\"summary_text\":\"一段不超过120字的周摘要\","
                "\"confidence\":0.0"
                "}"
            )
            extra = "请从多个日摘要中提炼本周重复出现的模式,过滤单日噪音。"
        else:
            schema = (
                "{"
                "\"long_term_topics\":[\"主题1\"],"
                "\"stable_traits\":[\"特征1\"],"
                "\"habit_patterns\":[\"习惯1\"],"
                "\"long_term_reply_preferences\":[\"偏好1\"],"
                "\"phase_state\":[\"状态1\"],"
                "\"temperament_tendency\":\"一句中文\","
                "\"summary_text\":\"一段不超过140字的月摘要\","
                "\"confidence\":0.0"
                "}"
            )
            extra = "请从多个周摘要中提炼阶段性稳定特征,只有反复出现的模式才能进入长期层。"
        return (
            f"你是微信群后台的成员{digest_type}摘要生成器。\n"
            f"{extra}\n"
            "不可做心理诊断、负面评价、隐私猜测。输出严格 JSON,不要 markdown。\n"
            f"{schema}\n"
            "要求:\n"
            "1. 所有列表字段最多5项,必须中性克制。\n"
            "2. 只有多个下级摘要反复出现的特征,才允许写进 stable_traits / habit_patterns / long_term_reply_preferences。\n"
            "3. recent_state / phase_state 只描述当前阶段状态,不要冒充长期人格。\n"
            f"成员: {display_name} ({wxid})\n"
            f"群ID: {chatroom_id}\n"
            f"周期: {period_key}\n"
            "下级摘要:\n" + ("\n".join(structured_lines) or "暂无")
        )
@staticmethod
def build_final_context_prompt(chatroom_id: str, wxid: str, display_name: str,
monthly_digests: List[Dict], weekly_digests: List[Dict],
daily_digests: List[Dict]) -> str:
monthly_lines = [json.dumps(item.get("structured", {}), ensure_ascii=False) for item in monthly_digests[:6]]
weekly_lines = [json.dumps(item.get("structured", {}), ensure_ascii=False) for item in weekly_digests[:4]]
daily_lines = [json.dumps(item.get("structured", {}), ensure_ascii=False) for item in daily_digests[:6]]
return (
"你是微信群后台的最终成员交互画像整理器。\n"
"请结合月级、周级、日级摘要,输出一个既有长期层又有近期层的后台交互画像。\n"
"不要做敏感推断、心理诊断、隐私猜测。输出严格 JSON,不要 markdown。\n"
"{"
"\"activity_level\":\"高活跃|中活跃|低活跃|观察中\","
"\"message_pattern\":\"一句中文\","
"\"interaction_style\":\"一句中文\","
"\"response_style_hint\":\"一句中文\","
"\"topics_of_interest\":[\"主题1\"],"
"\"recent_focus\":[\"近期主题1\"],"
"\"stable_traits\":[\"长期特征1\"],"
"\"habit_patterns\":[\"习惯1\"],"
"\"long_term_reply_preferences\":[\"偏好1\"],"
"\"recent_state\":[\"近期状态1\"],"
"\"temperament_tendency\":\"一句中文\","
"\"summary_text\":\"一段不超过150字的后台摘要\","
"\"confidence\":0.0,"
"\"engagement_traits\":[\"特征1\"],"
"\"reply_taboos\":[\"避坑1\"]"
"}\n"
"要求:\n"
"1. stable_traits、habit_patterns、long_term_reply_preferences 只从月级和多次重复证据中提取。\n"
"2. recent_focus、recent_state 更依赖最近周级和日级。\n"
"3. summary_text 要像后台备注,不要明显暴露在给用户做画像。\n"
f"成员: {display_name} ({wxid})\n"
f"群ID: {chatroom_id}\n"
"月级摘要:\n" + ("\n".join(monthly_lines) or "暂无")
+ "\n周级摘要:\n" + ("\n".join(weekly_lines) or "暂无")
+ "\n日级摘要:\n" + ("\n".join(daily_lines) or "暂无")
)
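
Review note: the field fallback used when flattening lower-level digests into `payload` can be exercised in isolation. The following is a minimal sketch, assuming each digest row carries its parsed JSON under a `structured` key (as `build_final_context_prompt` reads it). `first_present` and `normalize_digest` are hypothetical helper names and are not part of this commit.

```python
import json
from typing import Any, Dict, List


def first_present(structured: Dict[str, Any], keys: List[str], default: Any) -> Any:
    """Return the first truthy value among keys, mirroring the `a or b or []` chain."""
    for key in keys:
        value = structured.get(key)
        if value:
            return value
    return default


def normalize_digest(item: Dict[str, Any]) -> Dict[str, Any]:
    """Map daily/weekly/monthly field names onto one payload shape (hypothetical helper)."""
    structured = item.get("structured") or {}
    return {
        "period_key": item.get("period_key"),
        "summary_text": item.get("summary_text", ""),
        "topics": first_present(structured, ["topics", "stable_topics", "long_term_topics"], []),
        "habit_signals": first_present(structured, ["habit_signals", "habit_patterns"], []),
        "engagement_traits": first_present(structured, ["engagement_traits", "stable_traits"], []),
        "reply_preferences": first_present(
            structured, ["reply_preferences", "long_term_reply_preferences"], []
        ),
        "temperament_signal": first_present(
            structured, ["temperament_signal", "temperament_tendency"], ""
        ),
        "recent_state": first_present(structured, ["recent_state"], []),
    }


# A weekly row uses the weekly schema names (stable_topics, temperament_tendency).
weekly_row = {
    "period_key": "2026-W13",
    "summary_text": "本周多次讨论部署问题",
    "structured": {"stable_topics": ["部署"], "temperament_tendency": "直接"},
}
line = json.dumps(normalize_digest(weekly_row), ensure_ascii=False)
```

Because the chain relies on truthiness, an empty list under the first key falls through to the next key, which matches the behaviour of the `or` expressions in the diff.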


@@ -1,6 +1,5 @@
# -*- coding: utf-8 -*-
import json
import math
import re
from collections import Counter
from datetime import datetime
@@ -12,7 +11,10 @@ from loguru import logger
from db.connection import DBConnectionManager
from db.contacts_db import ContactsDBOperator
from db.member_context_db import MemberContextDBOperator
from db.member_digest_db import MemberDigestDBOperator
from db.message_storage import MessageStorageDB
from plugins.member_context.digest_service import MemberDigestService
from plugins.member_context.prompt_builder import MemberContextPromptBuilder
from utils.robot_cmd.robot_command import Feature, GroupBotManager, PermissionStatus
@@ -33,66 +35,122 @@ class MemberContextService:
self.contacts_db = ContactsDBOperator(self.db_manager)
self.message_db = MessageStorageDB(self.db_manager)
self.member_context_db = MemberContextDBOperator(self.db_manager)
self.member_digest_db = MemberDigestDBOperator(self.db_manager)
self.digest_service = MemberDigestService(
self.contacts_db, self.message_db, self.member_digest_db, plugin_config or {}
)
self.LOG = logger
self.plugin_config = plugin_config or {}
api_config = self.plugin_config.get("api", {})
profile_config = self.plugin_config.get("profile", {})
schedule_config = self.plugin_config.get("schedule", {})
self.ai_enabled = bool(api_config.get("enabled", False))
self.ai_enabled = bool(api_config.get("enable", api_config.get("enabled", False)))
self.ai_base_url = (api_config.get("base_url") or "").rstrip("/")
self.ai_api_key = api_config.get("api_key", "")
self.ai_endpoint = str(api_config.get("endpoint", "completion-messages")).lstrip("/")
self.ai_timeout = int(api_config.get("request_timeout", 60))
self.sample_days = int(profile_config.get("sample_days", 30))
self.ai_sample_limit = int(profile_config.get("sample_message_limit", 80))
self.refresh_limit_per_member = int(profile_config.get("refresh_limit_per_member", 200))
self.ai_min_member_messages = int(profile_config.get("ai_min_member_messages", 12))
self.active_member_hours = int(profile_config.get("active_member_hours", 72))
self.min_member_messages = int(profile_config.get("min_member_messages", 3))
self.max_members_per_group_per_run = int(profile_config.get("max_members_per_group_per_run", 30))
self.stale_hours = int(profile_config.get("stale_hours", 24))
schedule_config = self.plugin_config.get("schedule", {})
self.stable_decay = float(profile_config.get("stable_decay", 0.96))
self.stable_max_items = int(profile_config.get("stable_max_items", 6))
self.stable_min_score = float(profile_config.get("stable_min_score", 0.9))
self.stable_ready_days = int(profile_config.get("stable_ready_days", 180))
self.only_recent_active_groups = bool(schedule_config.get("only_recent_active_groups", False))
self.active_hours = int(schedule_config.get("active_hours", 72))
self.min_group_messages = int(schedule_config.get("min_group_messages", 20))
def build_member_context(self, chatroom_id: str, wxid: str, days: Optional[int] = None,
limit: Optional[int] = None) -> Dict:
limit: Optional[int] = None, force_digest_rebuild: bool = False) -> Dict:
days = days or self.sample_days
limit = limit or self.refresh_limit_per_member
existing_context = self.member_context_db.get_member_context(chatroom_id, wxid)
member = self.contacts_db.get_chatroom_member_info(chatroom_id, wxid) or {}
messages = self.message_db.get_member_recent_messages(chatroom_id, wxid, days=days, limit=limit)
recent_messages = self.message_db.get_member_recent_messages(chatroom_id, wxid, days=min(days, 7), limit=100)
display_name = member.get("display_name") or member.get("nick_name") or wxid
activity_level = self._calc_activity_level(len(messages), days)
message_pattern = self._build_message_pattern(messages)
response_style_hint = self._build_response_style_hint(messages)
topics = self._extract_keywords(messages, limit=5)
recent_focus = self._extract_keywords(recent_messages, limit=4)
confidence = self._calc_confidence(len(messages))
digest_snapshot = self.digest_service.ensure_member_digest_pipeline(
chatroom_id, wxid, force=force_digest_rebuild
)
daily_digests = digest_snapshot.get("daily_digests", [])
weekly_digests = digest_snapshot.get("weekly_digests", [])
monthly_digests = digest_snapshot.get("monthly_digests", [])
recent_messages = self.message_db.get_member_recent_messages(chatroom_id, wxid, days=min(days, 7), limit=120)
monthly_structured = [item.get("structured", {}) or {} for item in monthly_digests]
weekly_structured = [item.get("structured", {}) or {} for item in weekly_digests]
daily_structured = [item.get("structured", {}) or {} for item in daily_digests]
observation_days = self._calc_observation_days(daily_digests)
activity_level = self._calc_activity_level(len(recent_messages), max(min(days, 7), 1))
context = {
"chatroom_id": chatroom_id,
"wxid": wxid,
"display_name": display_name,
"activity_level": activity_level,
"message_pattern": message_pattern,
"interaction_style": self._build_interaction_style(messages),
"response_style_hint": response_style_hint,
"topics_of_interest": topics,
"recent_focus": recent_focus,
"summary_text": self._build_summary_text(activity_level, message_pattern, response_style_hint, topics, recent_focus),
"confidence": confidence,
"source_message_count": len(messages),
"message_pattern": self._best_text(
daily_structured, ["message_pattern"], default=self._build_message_pattern(recent_messages)
),
"interaction_style": self._best_text(
daily_structured, ["interaction_style"], default=self._build_interaction_style(recent_messages)
),
"response_style_hint": self._build_response_style_hint_from_digests(
daily_structured, weekly_structured, monthly_structured
),
"topics_of_interest": self._extract_scored_items(
monthly_structured + weekly_structured, ["long_term_topics", "stable_topics", "topics"], limit=5
),
"recent_focus": self._extract_scored_items(daily_structured, ["topics"], limit=4),
"summary_text": "",
"confidence": self._calc_digest_confidence(monthly_digests, weekly_digests, daily_digests),
"source_message_count": len(recent_messages),
"source_days": days,
"last_profiled_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"meta": self._build_meta(messages, recent_messages),
"meta": {
"stable_traits": self._extract_scored_items(
monthly_structured + weekly_structured, ["stable_traits", "engagement_traits"], limit=self.stable_max_items
),
"habit_patterns": self._extract_scored_items(
monthly_structured + weekly_structured + daily_structured,
["habit_patterns", "habit_signals"], limit=self.stable_max_items
),
"long_term_reply_preferences": self._extract_scored_items(
monthly_structured + weekly_structured, ["long_term_reply_preferences", "reply_preferences"], limit=4
),
"recent_state": self._extract_scored_items(
weekly_structured + daily_structured, ["recent_state", "phase_state", "topics"], limit=4
),
"temperament_tendency": self._best_text(
monthly_structured + weekly_structured + daily_structured,
["temperament_tendency", "temperament_signal"], default=""
),
"engagement_traits": self._extract_scored_items(
daily_structured + weekly_structured, ["engagement_traits", "stable_traits"], limit=4
),
"reply_taboos": self._extract_scored_items(daily_structured, ["reply_taboos"], limit=3),
"observation_days": observation_days,
"stable_ready": observation_days >= self.stable_ready_days,
"profile_iterations": int(((existing_context or {}).get("meta", {}) or {}).get("profile_iterations", 0)) + 1,
"history_message_count": self._sum_digest_source_count(daily_digests),
"digest_daily_count": len(daily_digests),
"digest_weekly_count": len(weekly_digests),
"digest_monthly_count": len(monthly_digests),
"last_daily_digest_at": daily_digests[0].get("last_generated_at") if daily_digests else "",
"last_weekly_digest_at": weekly_digests[0].get("last_generated_at") if weekly_digests else "",
"last_monthly_digest_at": monthly_digests[0].get("last_generated_at") if monthly_digests else "",
"refresh_mode": self._build_refresh_mode(existing_context, digest_snapshot),
},
}
ai_context = self._generate_ai_context(chatroom_id, wxid, display_name, context, messages)
ai_context = self._generate_ai_context_from_digests(
chatroom_id, wxid, display_name, monthly_digests, weekly_digests, daily_digests
)
if ai_context:
context.update({
"activity_level": ai_context.get("activity_level") or context["activity_level"],
@@ -105,6 +163,9 @@ class MemberContextService:
"confidence": ai_context.get("confidence", context["confidence"]),
})
context["meta"].update(ai_context.get("meta", {}))
context = self._merge_with_existing_context(existing_context, context)
context["summary_text"] = context.get("summary_text") or self._build_summary_text_from_context(context)
return context
def refresh_member_context(self, chatroom_id: str, wxid: str, days: Optional[int] = None,
@@ -117,6 +178,10 @@ class MemberContextService:
self.LOG.info(
f"[成员交互摘要] 单个成员刷新完成: group={chatroom_id}, wxid={wxid}, "
f"display_name={context.get('display_name', wxid)}, messages={context.get('source_message_count', 0)}, "
f"mode={context.get('meta', {}).get('refresh_mode', '')}, "
f"digests={context.get('meta', {}).get('digest_daily_count', 0)}/"
f"{context.get('meta', {}).get('digest_weekly_count', 0)}/"
f"{context.get('meta', {}).get('digest_monthly_count', 0)}, "
f"ai={'yes' if context.get('meta', {}).get('ai_provider') else 'no'}"
)
return context
@@ -167,7 +232,7 @@ class MemberContextService:
)
continue
context = self.build_member_context(chatroom_id, wxid, days=days, limit=limit_per_member)
if context["source_message_count"] <= 0:
if context["source_message_count"] <= 0 and context.get("meta", {}).get("digest_daily_count", 0) <= 0:
skipped += 1
self.LOG.debug(
f"[成员交互摘要] 跳过成员(样本不足): group={chatroom_id}, "
@@ -181,6 +246,10 @@ class MemberContextService:
f"wxid={wxid}, display_name={context.get('display_name', wxid)}, "
f"messages={context.get('source_message_count', 0)}, "
f"activity={context.get('activity_level', '')}, "
f"mode={context.get('meta', {}).get('refresh_mode', '')}, "
f"digests={context.get('meta', {}).get('digest_daily_count', 0)}/"
f"{context.get('meta', {}).get('digest_weekly_count', 0)}/"
f"{context.get('meta', {}).get('digest_monthly_count', 0)}, "
f"ai={'yes' if context.get('meta', {}).get('ai_provider') else 'no'}"
)
@@ -238,7 +307,10 @@ class MemberContextService:
f"skipped={result.get('skipped', 0)}, active_candidates={result.get('active_candidates', 0)}"
)
self.LOG.info(f"成员交互摘要刷新完成: 启用活跃群={group_count}, 成员={member_count}, 跳过={skipped}, 未启用群={disabled}, 非活跃群={inactive}")
self.LOG.info(
f"成员交互摘要刷新完成: 启用活跃群={group_count}, 成员={member_count}, 跳过={skipped}, "
f"未启用群={disabled}, 非活跃群={inactive}"
)
return {"groups": group_count, "members": member_count, "skipped": skipped, "disabled_groups": disabled, "inactive_groups": inactive}
def is_group_enabled(self, chatroom_id: str) -> bool:
@@ -247,129 +319,6 @@ class MemberContextService:
return True
return GroupBotManager.get_group_permission(chatroom_id, feature) == PermissionStatus.ENABLED
def _calc_activity_level(self, message_count: int, days: int) -> str:
daily_avg = message_count / max(days, 1)
if message_count >= 80 or daily_avg >= 3:
return "高活跃"
if message_count >= 25 or daily_avg >= 1:
return "中活跃"
if message_count > 0:
return "低活跃"
return "观察中"
def _build_message_pattern(self, messages: List[Dict]) -> str:
if not messages:
return "样本较少,暂不做明显模式判断"
contents = [m.get("content", "") for m in messages if m.get("content")]
if not contents:
return "样本较少,暂不做明显模式判断"
avg_len = sum(len(c) for c in contents) / len(contents)
question_ratio = sum(1 for c in contents if "?" in c or "?" in c) / len(contents)
link_ratio = sum(1 for c in contents if "http://" in c or "https://" in c) / len(contents)
traits = []
if avg_len <= 12:
traits.append("短句居多")
elif avg_len >= 35:
traits.append("表达较完整")
else:
traits.append("表达中等长度")
if question_ratio >= 0.35:
traits.append("问题导向明显")
elif question_ratio >= 0.15:
traits.append("偶尔连续追问")
if link_ratio >= 0.15:
traits.append("常分享链接或资料")
if not traits:
traits.append("发言较平稳")
return "".join(traits)
def _build_response_style_hint(self, messages: List[Dict]) -> str:
if not messages:
return "样本不足时保持中性、简洁、避免过度熟络"
contents = [m.get("content", "") for m in messages if m.get("content")]
avg_len = sum(len(c) for c in contents) / max(len(contents), 1)
question_ratio = sum(1 for c in contents if "?" in c or "?" in c) / max(len(contents), 1)
if question_ratio >= 0.35:
return "优先给明确结论,再补充步骤或依据,避免空泛回应"
if avg_len <= 12:
return "回复尽量简洁直接,先回答核心点,减少铺垫"
if avg_len >= 35:
return "可以给稍完整的解释,但保持结构清楚,避免冗长"
return "保持自然口语化,结论和解释尽量平衡"
def _build_interaction_style(self, messages: List[Dict]) -> str:
if not messages:
return "互动样本较少"
contents = [m.get("content", "") for m in messages if m.get("content")]
question_ratio = sum(1 for c in contents if "?" in c or "?" in c) / max(len(contents), 1)
emoji_ratio = sum(1 for c in contents if re.search(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", c)) / max(len(contents), 1)
mention_ratio = sum(1 for c in contents if "@" in c) / max(len(contents), 1)
parts = []
if question_ratio >= 0.3:
parts.append("偏提问推进")
if emoji_ratio >= 0.15:
parts.append("表情互动感较强")
if mention_ratio >= 0.1:
parts.append("会主动点名互动")
if not parts:
parts.append("自然跟随式互动")
return "".join(parts)
def _extract_keywords(self, messages: List[Dict], limit: int = 5) -> List[str]:
counter = Counter()
for message in messages:
content = message.get("content", "")
for token in self._tokenize(content):
if token in self.STOPWORDS:
continue
counter[token] += 1
return [word for word, _ in counter.most_common(limit)]
def _tokenize(self, text: str) -> List[str]:
chinese_words = re.findall(r"[\u4e00-\u9fff]{2,6}", text)
english_words = re.findall(r"[A-Za-z][A-Za-z0-9_-]{2,20}", text)
return chinese_words + [word.lower() for word in english_words]
def _calc_confidence(self, message_count: int) -> float:
return round(min(0.95, math.log(message_count + 1, 10)), 2) if message_count > 0 else 0.1
def _build_summary_text(self, activity_level: str, message_pattern: str,
response_style_hint: str, topics: List[str], recent_focus: List[str]) -> str:
parts = [
f"近期互动强度:{activity_level}",
f"表达特征:{message_pattern}",
f"回复建议:{response_style_hint}",
]
if topics:
parts.append(f"长期关注:{''.join(topics)}")
if recent_focus:
parts.append(f"近期话题:{''.join(recent_focus)}")
return "".join(parts)
def _build_meta(self, messages: List[Dict], recent_messages: List[Dict]) -> Dict:
latest_time = None
if recent_messages:
latest = recent_messages[-1].get("timestamp")
if isinstance(latest, datetime):
latest_time = latest.strftime("%Y-%m-%d %H:%M:%S")
elif latest:
latest_time = str(latest)
return {
"message_count_30d": len(messages),
"message_count_7d": len(recent_messages),
"latest_message_time": latest_time,
}
def _get_recent_active_members(self, chatroom_id: str) -> List[Dict]:
sql = """
SELECT
@@ -425,7 +374,10 @@ class MemberContextService:
try:
return datetime.strptime(str(value), "%Y-%m-%d %H:%M:%S")
except Exception:
return None
try:
return datetime.strptime(str(value)[:10], "%Y-%m-%d")
except Exception:
return None
def _get_recent_active_chatrooms(self) -> set:
sql = """
@@ -439,14 +391,17 @@ class MemberContextService:
rows = self.message_db.execute_query(sql, ("%@chatroom", self.active_hours, self.min_group_messages)) or []
return {row.get("group_id") for row in rows if row.get("group_id")}
def _generate_ai_context(self, chatroom_id: str, wxid: str, display_name: str,
base_context: Dict, messages: List[Dict]) -> Optional[Dict]:
def _generate_ai_context_from_digests(self, chatroom_id: str, wxid: str, display_name: str,
monthly_digests: List[Dict], weekly_digests: List[Dict],
daily_digests: List[Dict]) -> Optional[Dict]:
if not self.ai_enabled or not self.ai_base_url or not self.ai_api_key:
return None
if len(messages) < self.ai_min_member_messages:
if len(daily_digests) < 2 and len(weekly_digests) < 1 and len(monthly_digests) < 1:
return None
prompt = self._build_ai_prompt(chatroom_id, wxid, display_name, base_context, messages[-self.ai_sample_limit:])
prompt = MemberContextPromptBuilder.build_final_context_prompt(
chatroom_id, wxid, display_name, monthly_digests, weekly_digests, daily_digests
)
headers = {
"Authorization": f"Bearer {self.ai_api_key}",
"Content-Type": "application/json",
@@ -454,88 +409,38 @@ class MemberContextService:
payload = {
"inputs": {"query": prompt},
"response_mode": "blocking",
"user": f"member-context:{chatroom_id}:{wxid}",
"user": f"member-context-final:{chatroom_id}:{wxid}",
}
url = f"{self.ai_base_url}/{self.ai_endpoint}"
try:
self.LOG.info(
f"[成员交互摘要][AI] 发起最终画像请求: group={chatroom_id}, wxid={wxid}, "
f"monthly={len(monthly_digests)}, weekly={len(weekly_digests)}, daily={len(daily_digests)}"
)
response = requests.post(url, headers=headers, json=payload, timeout=self.ai_timeout)
response.raise_for_status()
response_data = response.json()
parsed = self._parse_ai_answer(response_data.get("answer", ""))
data = response.json()
parsed = self._parse_ai_answer(data.get("answer", ""))
if not parsed:
self.LOG.warning(
f"[成员交互摘要][AI] 最终画像JSON解析失败: group={chatroom_id}, wxid={wxid}, "
f"answer_preview={(data.get('answer', '') or '')[:200]}"
)
return None
usage = (response_data.get("metadata") or {}).get("usage", {}) or {}
parsed["meta"] = {
usage = (data.get("metadata") or {}).get("usage", {}) or {}
parsed_meta = parsed.get("meta", {}) or {}
parsed_meta.update({
"ai_provider": "dify",
"ai_mode": "completion",
"ai_tokens": usage.get("total_tokens"),
"ai_latency": usage.get("latency"),
}
})
parsed["meta"] = parsed_meta
return parsed
except Exception as e:
self.LOG.warning(f"成员交互摘要 AI 生成失败,回退到本地摘要: chatroom={chatroom_id}, wxid={wxid}, error={e}")
self.LOG.warning(f"成员交互摘要最终画像 AI 生成失败,回退到本地融合: chatroom={chatroom_id}, wxid={wxid}, error={e}")
return None
def _build_ai_prompt(self, chatroom_id: str, wxid: str, display_name: str,
base_context: Dict, messages: List[Dict]) -> str:
message_lines = []
for msg in messages[-40:]:
ts = msg.get("timestamp")
if isinstance(ts, datetime):
ts = ts.strftime("%m-%d %H:%M")
content = (msg.get("content") or "").replace("\n", " ").strip()
content = content[:160]
if content:
message_lines.append(f"[{ts}] {content}")
topics = "".join(base_context.get("topics_of_interest", [])) or "无明显长期话题"
recent_focus = "".join(base_context.get("recent_focus", [])) or "无明显近期话题"
return (
"你是一个微信群运营后台的成员交互摘要提取器。\n"
"你的任务不是做人设分析,也不是性格判断,而是基于公开聊天记录,提取对后续回复策略有帮助的“交互特征摘要”。\n"
"你只能依据给定聊天样本输出保守结论,不能脑补,不能做敏感推断,不能写负面标签,不能输出隐私猜测。\n"
"请根据以下成员近30天公开发言输出一个严格 JSON 对象,不要 markdown,不要解释,不要代码块。\n"
"JSON schema:\n"
"{"
"\"activity_level\":\"高活跃|中活跃|低活跃|观察中\","
"\"message_pattern\":\"一句中文,描述表达特点\","
"\"interaction_style\":\"一句中文,描述他在群里如何与人互动\","
"\"response_style_hint\":\"一句中文,描述适合怎样回应\","
"\"topics_of_interest\":[\"主题1\",\"主题2\"],"
"\"recent_focus\":[\"近期主题1\",\"近期主题2\"],"
"\"summary_text\":\"一段不超过120字的后台交互摘要\","
"\"confidence\":0.0,"
"\"engagement_traits\":[\"特征1\",\"特征2\"],"
"\"reply_taboos\":[\"避坑1\",\"避坑2\"]"
"}\n"
"要求:\n"
"1. 只总结群内公开行为特征,不要输出性格诊断、负面标签或敏感结论。\n"
"2. topics_of_interest 表示相对稳定的话题偏好,最多5个;recent_focus 表示近期频繁提及的话题,最多4个。\n"
"3. message_pattern 只能描述可观察到的表达方式,例如:短句居多、问题导向、爱发链接、解释较完整、常接梗互动。\n"
"4. interaction_style 要描述他在群里的参与方式,例如:偏围观后插话、喜欢接梗、会连续追问、偏一对一回应。\n"
"5. response_style_hint 只能写对回复策略有帮助的建议,例如:先给结论再补步骤、保持简洁直接、可以适度接梗;不要写成评价语。\n"
"6. engagement_traits 最多4个,写成中性的短标签,例如:节奏快、爱追问细节、接梗自然、偏结果导向。\n"
"7. reply_taboos 最多3个,只写回复时应避免的方式,例如:避免长篇铺垫、避免过度说教、避免太官方。\n"
"8. summary_text 要像后台备注,客观、中性、克制,不要让人一眼看出是在给用户贴标签。\n"
"9. confidence 取值 0 到 1;如果样本较少或不稳定,必须降低 confidence。\n"
"10. 如果证据不足,宁可输出更弱、更泛化的结论,也不要瞎猜。\n\n"
"下面是正反例参考。\n"
"坏例子:这个人情绪化、爱抬杠、虚荣、玻璃心。\n"
"好例子:常用短句直接表达观点;遇到问题时更适合先给明确结论,再补充解释。\n\n"
f"成员标识: {display_name} ({wxid})\n"
f"群ID: {chatroom_id}\n"
f"样本消息数: {base_context.get('source_message_count', 0)}\n"
f"本地活跃度估计: {base_context.get('activity_level', '')}\n"
f"本地表达特征: {base_context.get('message_pattern', '')}\n"
f"本地互动风格: {base_context.get('interaction_style', '')}\n"
f"本地回复建议: {base_context.get('response_style_hint', '')}\n"
f"本地长期关注: {topics}\n"
f"本地近期话题: {recent_focus}\n"
"最近消息样本:\n"
+ "\n".join(message_lines)
)
def _parse_ai_answer(self, answer: str) -> Optional[Dict]:
if not answer:
return None
@@ -548,18 +453,10 @@ class MemberContextService:
except Exception:
return None
topics = data.get("topics_of_interest") or []
recent_focus = data.get("recent_focus") or []
engagement_traits = data.get("engagement_traits") or []
reply_taboos = data.get("reply_taboos") or []
if not isinstance(topics, list):
topics = []
if not isinstance(recent_focus, list):
recent_focus = []
if not isinstance(engagement_traits, list):
engagement_traits = []
if not isinstance(reply_taboos, list):
reply_taboos = []
def norm_list(value, limit):
if not isinstance(value, list):
return []
return [str(item).strip() for item in value[:limit] if str(item).strip()]
try:
confidence = float(data.get("confidence", 0))
@@ -571,12 +468,244 @@ class MemberContextService:
"message_pattern": str(data.get("message_pattern", "")).strip(),
"interaction_style": str(data.get("interaction_style", "")).strip(),
"response_style_hint": str(data.get("response_style_hint", "")).strip(),
"topics_of_interest": [str(item).strip() for item in topics[:5] if str(item).strip()],
"recent_focus": [str(item).strip() for item in recent_focus[:4] if str(item).strip()],
"topics_of_interest": norm_list(data.get("topics_of_interest"), 5),
"recent_focus": norm_list(data.get("recent_focus"), 4),
"summary_text": str(data.get("summary_text", "")).strip(),
"confidence": max(0.0, min(1.0, confidence)),
"meta": {
"engagement_traits": [str(item).strip() for item in engagement_traits[:4] if str(item).strip()],
"reply_taboos": [str(item).strip() for item in reply_taboos[:3] if str(item).strip()],
"stable_traits": norm_list(data.get("stable_traits"), self.stable_max_items),
"habit_patterns": norm_list(data.get("habit_patterns"), self.stable_max_items),
"long_term_reply_preferences": norm_list(data.get("long_term_reply_preferences"), 4),
"recent_state": norm_list(data.get("recent_state"), 4),
"temperament_tendency": str(data.get("temperament_tendency", "")).strip(),
"engagement_traits": norm_list(data.get("engagement_traits"), 4),
"reply_taboos": norm_list(data.get("reply_taboos"), 3),
}
}
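
Review note: the nested `norm_list` helper replaces four repeated `isinstance` checks. A standalone copy, shown here as a sketch for experimenting outside the class, makes two edge cases visible: the slice is taken before blanks are dropped, and non-string entries are stringified rather than rejected.

```python
from typing import Any, List


def norm_list(value: Any, limit: int) -> List[str]:
    """Coerce an AI-returned field into a list of non-empty, stripped strings."""
    if not isinstance(value, list):
        # Strings, dicts, None and numbers all collapse to an empty list.
        return []
    # Slice first, then filter: blank entries inside the first `limit`
    # items shorten the result instead of being backfilled from later items.
    return [str(item).strip() for item in value[:limit] if str(item).strip()]


not_a_list = norm_list("部署,测试", 5)
with_blank = norm_list([" 部署 ", "", "测试", "上线"], 3)
with_none = norm_list([None, 3], 5)
```

If the model ever returns `null` inside a list, `str(None)` yields the literal text `"None"`, so a caller that cares about this may want to filter `None` before stringifying. That filter is a suggestion, not something the commit does.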
def _merge_with_existing_context(self, existing_context: Optional[Dict], current_context: Dict) -> Dict:
existing_context = existing_context or {}
existing_meta = existing_context.get("meta", {}) or {}
meta = current_context.get("meta", {}) or {}
observation_days = max(
int(meta.get("observation_days", 0)),
int(existing_meta.get("observation_days", 0)),
)
meta["observation_days"] = observation_days
meta["stable_ready"] = observation_days >= self.stable_ready_days
merged_topic_scores = self._merge_scored_items(
existing_meta.get("topic_scores", {}),
current_context.get("topics_of_interest", []),
current_context.get("confidence", 0),
)
merged_trait_scores = self._merge_scored_items(
existing_meta.get("stable_trait_scores", {}),
meta.get("stable_traits", []),
current_context.get("confidence", 0),
)
merged_habit_scores = self._merge_scored_items(
existing_meta.get("habit_pattern_scores", {}),
meta.get("habit_patterns", []),
current_context.get("confidence", 0),
)
merged_reply_pref_scores = self._merge_scored_items(
existing_meta.get("long_term_reply_preference_scores", {}),
meta.get("long_term_reply_preferences", []),
current_context.get("confidence", 0),
)
merged_temperament_scores = self._merge_scored_items(
existing_meta.get("temperament_tendency_scores", {}),
[meta.get("temperament_tendency")] if meta.get("temperament_tendency") else [],
current_context.get("confidence", 0) * 0.9,
)
meta["topic_scores"] = merged_topic_scores
meta["stable_trait_scores"] = merged_trait_scores
meta["habit_pattern_scores"] = merged_habit_scores
meta["long_term_reply_preference_scores"] = merged_reply_pref_scores
meta["temperament_tendency_scores"] = merged_temperament_scores
meta["stable_traits"] = self._top_scored_items(merged_trait_scores, limit=self.stable_max_items)
meta["habit_patterns"] = self._top_scored_items(merged_habit_scores, limit=self.stable_max_items)
meta["long_term_reply_preferences"] = self._top_scored_items(merged_reply_pref_scores, limit=4)
temperament = self._top_scored_items(merged_temperament_scores, limit=1)
meta["temperament_tendency"] = temperament[0] if temperament else meta.get("temperament_tendency", "")
meta["engagement_traits"] = (meta.get("engagement_traits") or existing_meta.get("engagement_traits") or [])[:4]
meta["reply_taboos"] = (meta.get("reply_taboos") or existing_meta.get("reply_taboos") or [])[:3]
meta["recent_state"] = (meta.get("recent_state") or existing_meta.get("recent_state") or [])[:4]
meta["profile_iterations"] = max(
int(meta.get("profile_iterations", 0)),
int(existing_meta.get("profile_iterations", 0)),
)
meta["history_message_count"] = max(
int(meta.get("history_message_count", 0)),
int(existing_meta.get("history_message_count", 0)),
)
current_context["topics_of_interest"] = self._top_scored_items(merged_topic_scores, limit=5) or current_context.get("topics_of_interest", [])
current_context["recent_focus"] = (current_context.get("recent_focus") or existing_context.get("recent_focus") or [])[:4]
current_context["response_style_hint"] = current_context.get("response_style_hint") or existing_context.get("response_style_hint") or ""
current_context["meta"] = meta
return current_context
def _extract_scored_items(self, items: List[Dict], keys: List[str], limit: int) -> List[str]:
scores = {}
for index, item in enumerate(items):
weight = max(0.5, 1.2 - index * 0.08)
for key in keys:
values = item.get(key, [])
if not isinstance(values, list):
continue
for value in values:
normalized = str(value).strip()
if not normalized:
continue
scores[normalized] = scores.get(normalized, 0.0) + weight
return [key for key, _ in sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:limit]]
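
Review note: `_extract_scored_items` weights each digest by its position in the input list, so the order in which callers concatenate `monthly_structured + weekly_structured` matters. The sketch below is a standalone copy of the same arithmetic (the free-function form is an assumption made for illustration) with a small worked input.

```python
from typing import Dict, List


def extract_scored_items(items: List[Dict], keys: List[str], limit: int) -> List[str]:
    """Rank string values found under `keys`, favouring earlier digests."""
    scores: Dict[str, float] = {}
    for index, item in enumerate(items):
        # 1.2 for the first digest, minus 0.08 per position, floored at 0.5.
        weight = max(0.5, 1.2 - index * 0.08)
        for key in keys:
            values = item.get(key, [])
            if not isinstance(values, list):
                continue
            for value in values:
                normalized = str(value).strip()
                if normalized:
                    scores[normalized] = scores.get(normalized, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:limit]]


digests = [
    {"topics": ["部署", "测试"]},   # index 0, weight 1.2
    {"topics": ["测试"]},           # index 1, weight 1.12
    {"stable_topics": ["监控"]},    # index 2, weight 1.04
]
top_two = extract_scored_items(digests, ["topics", "stable_topics"], limit=2)
```

Here "测试" accumulates 1.2 + 1.12 and outranks "部署" at 1.2. The 0.5 floor is reached from index 9 onward, so in long lists the tail entries all count equally.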
def _best_text(self, items: List[Dict], keys: List[str], default: str = "") -> str:
counter = Counter()
for item in items:
for key in keys:
value = str(item.get(key, "")).strip()
if value:
counter[value] += 1
if counter:
return counter.most_common(1)[0][0]
return default
def _build_response_style_hint_from_digests(self, daily_structured: List[Dict],
weekly_structured: List[Dict],
monthly_structured: List[Dict]) -> str:
hint = self._best_text(daily_structured, ["response_style_hint"])
if hint:
return hint
preferences = self._extract_scored_items(
monthly_structured + weekly_structured,
["long_term_reply_preferences", "reply_preferences"],
limit=3,
)
if preferences:
return "更适合:" + "".join(preferences[:3])
return "保持自然口语化,结论和解释尽量平衡"
def _calc_digest_confidence(self, monthly_digests: List[Dict], weekly_digests: List[Dict],
daily_digests: List[Dict]) -> float:
base = 0.25
base += min(0.35, len(monthly_digests) * 0.08)
base += min(0.2, len(weekly_digests) * 0.04)
base += min(0.15, len(daily_digests) * 0.02)
return round(min(0.95, base), 2)
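
Review note: a worked example of the confidence formula in `_calc_digest_confidence`. The function below repeats the same arithmetic on plain counts (taking counts instead of digest lists is a simplification made for this sketch).

```python
def digest_confidence(monthly: int, weekly: int, daily: int) -> float:
    """Base 0.25 plus capped contributions from each digest layer."""
    base = 0.25
    base += min(0.35, monthly * 0.08)  # cap reached at 5 monthly digests
    base += min(0.2, weekly * 0.04)    # cap reached at 5 weekly digests
    base += min(0.15, daily * 0.02)    # cap reached at 8 daily digests
    return round(min(0.95, base), 2)


cold_start = digest_confidence(0, 0, 0)
prompt_window = digest_confidence(6, 4, 6)   # the 6/4/6 slice used by the final prompt
saturated = digest_confidence(100, 100, 100)
```

The three caps sum with the base to exactly 0.95, so the outer `min(0.95, ...)` acts as a guard rather than an extra limit. With the 6/4/6 window the result is 0.25 + 0.35 + 0.16 + 0.12 = 0.88.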
def _calc_observation_days(self, daily_digests: List[Dict]) -> int:
if not daily_digests:
return 0
end_dt = self._parse_datetime(daily_digests[0].get("period_end"))
start_dt = self._parse_datetime(daily_digests[-1].get("period_start"))
if not start_dt or not end_dt:
return 0
return max(0, (end_dt - start_dt).days)
@staticmethod
def _sum_digest_source_count(daily_digests: List[Dict]) -> int:
return sum(int(item.get("source_count", 0)) for item in daily_digests)
def _build_refresh_mode(self, existing_context: Optional[Dict], digest_snapshot: Dict) -> str:
if not existing_context:
return "bootstrap"
if (digest_snapshot.get("stats", {}) or {}).get("built_monthly", 0) > 0:
return "recalibration"
return "incremental"
def _build_summary_text_from_context(self, context: Dict) -> str:
meta = context.get("meta", {}) or {}
parts = []
if meta.get("temperament_tendency"):
label = "长期沟通倾向" if meta.get("stable_ready") else "阶段性沟通倾向"
parts.append(f"{label}{meta.get('temperament_tendency')}")
if meta.get("stable_traits"):
parts.append(f"长期特征:{''.join(meta.get('stable_traits')[:3])}")
if meta.get("habit_patterns"):
parts.append(f"习惯模式:{''.join(meta.get('habit_patterns')[:3])}")
if meta.get("recent_state"):
parts.append(f"近期状态:{''.join(meta.get('recent_state')[:3])}")
if context.get("response_style_hint"):
parts.append(f"回复建议:{context.get('response_style_hint')}")
return "".join(parts[:5])
def _merge_scored_items(self, existing_scores: Dict, current_items: List[str], confidence: float) -> Dict[str, float]:
merged = {}
for key, value in (existing_scores or {}).items():
try:
score = float(value) * self.stable_decay
except Exception:
continue
if score >= 0.2:
merged[str(key).strip()] = round(score, 4)
boost = max(0.6, min(1.8, 0.8 + confidence))
for item in current_items or []:
normalized = str(item).strip()
if not normalized:
continue
merged[normalized] = round(merged.get(normalized, 0.0) + boost, 4)
return merged
def _top_scored_items(self, scores: Dict, limit: int) -> List[str]:
ordered = sorted(
((str(key).strip(), float(value)) for key, value in (scores or {}).items() if str(key).strip()),
key=lambda item: item[1],
reverse=True,
)
return [key for key, value in ordered if value >= self.stable_min_score][:limit]
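
Review note: the decay-and-boost merge is what lets long-term traits stabilise across refreshes. The sketch below copies the arithmetic of `_merge_scored_items` and `_top_scored_items` into free functions, with `decay=0.96` and `min_score=0.9` taken from the defaults in `__init__`; the free-function signatures are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def merge_scored_items(existing: Optional[Dict], current_items: List[str],
                       confidence: float, decay: float = 0.96) -> Dict[str, float]:
    """Decay every stored score, drop the faint ones, then boost items seen this run."""
    merged: Dict[str, float] = {}
    for key, value in (existing or {}).items():
        score = float(value) * decay
        if score >= 0.2:
            merged[str(key).strip()] = round(score, 4)
    # The boost grows with confidence and is clamped to [0.6, 1.8].
    boost = max(0.6, min(1.8, 0.8 + confidence))
    for item in current_items or []:
        name = str(item).strip()
        if name:
            merged[name] = round(merged.get(name, 0.0) + boost, 4)
    return merged


def top_scored_items(scores: Dict[str, float], limit: int, min_score: float = 0.9) -> List[str]:
    """Return the highest-scoring names that clear the stability threshold."""
    ordered = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, value in ordered if value >= min_score][:limit]


first = merge_scored_items({}, ["简洁直接"], confidence=0.5)      # boost 1.3
second = merge_scored_items(first, [], confidence=0.5)            # decay only
third = merge_scored_items(second, ["简洁直接"], confidence=0.5)  # decay, then boost
```

A trait seen once and never again decays by 4% per refresh and eventually stops clearing `min_score`, while a trait that keeps reappearing accumulates score faster than it decays. Note that the decay is applied per refresh, not per elapsed day, so the effective half-life depends on how often the refresh job runs.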
def _calc_activity_level(self, message_count: int, days: int) -> str:
daily_avg = message_count / max(days, 1)
if message_count >= 80 or daily_avg >= 3:
return "高活跃"
if message_count >= 25 or daily_avg >= 1:
return "中活跃"
if message_count > 0:
return "低活跃"
return "观察中"
def _build_message_pattern(self, messages: List[Dict]) -> str:
if not messages:
return "样本较少,暂不做明显模式判断"
contents = [m.get("content", "") for m in messages if m.get("content")]
if not contents:
return "样本较少,暂不做明显模式判断"
avg_len = sum(len(c) for c in contents) / len(contents)
question_ratio = sum(1 for c in contents if "?" in c or "?" in c) / len(contents)
link_ratio = sum(1 for c in contents if "http://" in c or "https://" in c) / len(contents)
traits = []
if avg_len <= 12:
traits.append("短句居多")
elif avg_len >= 35:
traits.append("表达较完整")
else:
traits.append("表达中等长度")
if question_ratio >= 0.35:
traits.append("问题导向明显")
elif question_ratio >= 0.15:
traits.append("偶尔连续追问")
if link_ratio >= 0.15:
traits.append("常分享链接或资料")
return "".join(traits or ["发言较平稳"])
def _build_interaction_style(self, messages: List[Dict]) -> str:
if not messages:
return "互动样本较少"
contents = [m.get("content", "") for m in messages if m.get("content")]
question_ratio = sum(1 for c in contents if "?" in c or "?" in c) / max(len(contents), 1)
emoji_ratio = sum(1 for c in contents if re.search(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", c)) / max(len(contents), 1)
mention_ratio = sum(1 for c in contents if "@" in c) / max(len(contents), 1)
parts = []
if question_ratio >= 0.3:
parts.append("偏提问推进")
if emoji_ratio >= 0.15:
parts.append("表情互动感较强")
if mention_ratio >= 0.1:
parts.append("会主动点名互动")
return "".join(parts or ["自然跟随式互动"])