# Chinese University Thesis Template Guide (中国高校论文模板指南) ## Why This Guide Exists Chinese university thesis templates (.docx) have structural patterns that differ significantly from Western templates. Agents that assume Western conventions (Heading1/Heading2/Normal) will fail repeatedly. This guide documents the ACTUAL patterns found in Chinese templates. ## Common StyleId Patterns ### Pattern A: Numeric IDs (most common in Chinese Word templates) | Style Purpose | styleId | w:name | w:basedOn | |--------------|---------|--------|-----------| | Normal body | `a` | "Normal" | — | | Default paragraph font | `a0` | "Default Paragraph Font" | — | | Heading 1 (章标题) | `1` | "heading 1" | `a` | | Heading 2 (节标题) | `2` | "heading 2" | `a` | | Heading 3 (小节标题) | `3` | "heading 3" | `a` | | TOC 1 | `11` | "toc 1" | `a` | | TOC 2 | `21` | "toc 2" | `a` | | TOC 3 | `31` | "toc 3" | `a` | | Header | `a3` | "header" | `a` | | Footer | `a4` | "footer" | `a` | | Table of Contents heading | `10` | "TOC Heading" | `1` | ### Pattern B: English IDs (less common, usually from international templates) Standard Heading1/Heading2/Heading3/Normal — these follow the Western pattern. ### Pattern C: Mixed (some Chinese, some English) Some templates define custom styles with Chinese names: | Style Purpose | styleId | w:name | |--------------|---------|--------| | 论文标题 | `lunwenbiaoti` | "论文标题" | | 章标题 | `zhangbiaoti` | "章标题" | | 正文 | `zhengwen` | "正文" | ### How to Identify Which Pattern ```bash # Extract all styleIds from the template $CLI analyze --input template.docx --styles-only # Or manually: # unzip template.docx word/styles.xml # Search for w:styleId= in the extracted file ``` Look at the first few styleIds. If you see `1`, `2`, `3`, `a`, `a0` → Pattern A. If you see `Heading1`, `Normal` → Pattern B. ## Standard Thesis Structure Chinese university theses follow a highly standardized structure: ``` ┌─────────────────────────────────────┐ │ 封面 (Cover Page) │ ← Usually 1-2 pages │ - 校名、校徽 │ │ - 论文题目 (title) │ │ - 作者、导师、院系、日期 │ ├─────────────────────────────────────┤ │ 学术诚信承诺书 / 独创性声明 │ ← 1 page │ (Academic Integrity Declaration) │ ├─────────────────────────────────────┤ │ 中文摘要 (Chinese Abstract) │ ← 1-2 pages │ - "摘 要" heading │ │ - Abstract body │ │ - "关键词:" line │ ├─────────────────────────────────────┤ │ 英文摘要 (English Abstract) │ ← 1-2 pages │ - "ABSTRACT" heading │ │ - Abstract body │ │ - "Keywords:" line │ ├─────────────────────────────────────┤ │ 目录 (Table of Contents) │ ← 1-3 pages │ - Often inside SDT block │ │ - Static example entries │ │ - TOC field code │ ├─────────────────────────────────────┤ │ 正文 (Body) │ ← Main content │ 第1章 绪论 │ │ 1.1 研究背景 │ │ 1.2 研究目的和意义 │ │ 第2章 文献综述 │ │ ... │ │ 第N章 结论与展望 │ ├─────────────────────────────────────┤ │ 参考文献 (References) │ ← Styled differently ├─────────────────────────────────────┤ │ 致谢 (Acknowledgments) │ ← Optional ├─────────────────────────────────────┤ │ 附录 (Appendices) │ ← Optional └─────────────────────────────────────┘ ``` ## Identifying Zone Boundaries in Templates Templates contain EXAMPLE content that must be replaced. Here's how to find the zones: ### Zone A (Front matter) — KEEP from template - Starts at: paragraph 0 - Ends at: the paragraph BEFORE the first chapter heading - Contains: cover, declaration, abstracts, TOC - How to detect end: search for first paragraph with style `1` (or Heading1) containing "第1章" or "绪论" ### Zone B (Body content) — REPLACE with user content - Starts at: first chapter heading ("第1章...") - Ends at: "参考文献" heading (inclusive) or last body paragraph before acknowledgments - How to detect: ```python for i, el in enumerate(body_elements): text = get_text(el) style = get_style(el) if style in ('1', 'Heading1') and ('第1章' in text or '绪论' in text): zone_b_start = i if '参考文献' in text: zone_b_end = i ``` ### Zone C (Back matter) — KEEP from template (or remove) - Starts after: 参考文献 - Contains: 致谢, 附录, final sectPr ## Font Expectations in Chinese Thesis Templates | Element | Font | Size (字号) | Size (pt) | w:sz | |---------|------|------------|-----------|------| | 论文标题 | 华文中宋 or 黑体 | 二号 or 小二 | 22pt or 18pt | 44 or 36 | | 章标题 (H1) | 黑体 | 三号 | 16pt | 32 | | 节标题 (H2) | 黑体 | 四号 | 14pt | 28 | | 小节标题 (H3) | 黑体 | 小四 | 12pt | 24 | | 正文 | 宋体 | 小四 | 12pt | 24 | | 页眉 | 宋体 | 五号 | 10.5pt | 21 | | 页脚/页码 | 宋体 | 五号 | 10.5pt | 21 | | 表格内容 | 宋体 | 五号 | 10.5pt | 21 | | 参考文献条目 | 宋体 | 五号 | 10.5pt | 21 | ## RunFonts for CJK Body Text ```xml ``` For headings: ```xml ``` IMPORTANT: When cleaning direct formatting, ALWAYS preserve w:eastAsia. Removing it causes Chinese text to fall back to the wrong font. ## Common Mistakes with Chinese Templates 1. **Searching for `Heading1`** — Chinese templates use `1`, not `Heading1` 2. **Clearing all rFonts** — Must keep eastAsia font declarations 3. **Assuming "第1章" is the first paragraph** — It's typically paragraph 100+ after cover/abstract/TOC 4. **Ignoring SDT blocks in TOC** — The TOC is wrapped in an SDT, not just field codes 5. **Wrong line spacing** — Chinese theses typically use fixed 20pt (line="400") or 22pt (line="440"), not the 28pt used in government documents 6. **Missing section breaks** — Each zone (abstract, TOC, body) usually has its own sectPr for different headers/footers ## Style Mapping Quick Reference When source document uses Western IDs and template uses Chinese numeric IDs: ```json { "Heading1": "1", "Heading2": "2", "Heading3": "3", "Heading4": "3", "Normal": "a", "BodyText": "a", "ListParagraph": "a", "Caption": "a", "TOC1": "11", "TOC2": "21", "TOC3": "31" } ``` When source uses Chinese numeric IDs and template uses Western IDs — reverse the mapping.