Add initial project structure and core functionality for ArchDoc

- Created `.gitignore` files for various directories to exclude unnecessary files.
- Added `PLAN.md` to outline the project goals and architecture documentation generation.
- Implemented the `archdoc-cli` with a command-line interface for initializing and generating documentation.
- Developed the `archdoc-core` library for analyzing Python projects and generating architecture documentation.
- Included caching mechanisms to optimize repeated analysis.
- Established a comprehensive test suite to ensure functionality and error handling.
- Updated `README.md` to provide an overview and installation instructions for ArchDoc.
This commit is contained in:
2026-01-25 20:17:37 +03:00
commit 3701cee205
36 changed files with 7394 additions and 0 deletions

11
.gitignore vendored Normal file
View File

@@ -0,0 +1,11 @@
# IDE files
*.swp
.DS_Store
# Backup files
*.rs.bk
# Project specific files
.archdoc/
.roo/
PLANS/

722
PLAN.md Normal file
View File

@@ -0,0 +1,722 @@
```md
# ArchDoc (V1) — Проектный документ для разработки
**Формат:** PRD + Tech Spec (Python-only, CLI-only)
**Стек реализации:** Rust (CLI), анализ Python через AST, генерация Markdown (diff-friendly)
**Дата:** 2026-01-25
---
## 1. Контекст и проблема
### 1.1. Боль
- Документация архитектуры и связей в кодовой базе устаревает практически сразу.
- В новых чатах LLM не имеет контекста проекта и не понимает “рельсы”: где что лежит, какие модули, какие зависимости критичны.
- В MR/PR сложно быстро оценить архитектурный impact: что поменялось в зависимостях, какие точки “пробило” изменения.
### 1.2. Цель
Сделать CLI-инструмент, который по существующему Python-проекту генерирует и поддерживает **человеко- и LLM-читаемую** документацию:
- от верхнего уровня (папки, модули, “рельсы”)
- до **уровня функций/методов** (что делают и с чем связаны)
при этом обновление должно быть **детерминированным** и **diff-friendly**.
---
## 2. Видение продукта
**ArchDoc** — CLI на Rust, который:
1) сканирует репозиторий Python-проекта,
2) строит модель модулей/файлов/символов и связей (imports + best-effort calls),
3) генерирует/обновляет набор Markdown-файлов так, чтобы `git diff` показывал **смысловые** изменения,
4) создаёт “Obsidian-style” навигацию по ссылкам: индекс → модуль → файл → символ (function/class/method).
---
## 3. Область охвата (V1)
### 3.1. In-scope (обязательно)
- Только **CLI** (без MCP/GUI в V1).
- Только **Python** (в дальнейшем расширяемость под другие языки).
- Документация:
- `ARCHITECTURE.md` как входная точка,
- детальные страницы по модулям и файлам,
- детализация по символам (functions/classes/methods) с связями.
- Связи:
- dependency graph по импортам модулей,
- best-effort call graph на уровне файла/символа,
- inbound/outbound зависимости (кто зависит / от кого зависит).
- Diff-friendly обновление:
- маркерные секции,
- перезапись только генерируемых блоков,
- стабильные ID и сортировки.
### 3.2. Out-of-scope (V1)
- MCP, IDE-интеграции.
- Полный семантический резолв вызовов (уровень LSP/type inference) — только best-effort.
- Визуальная “сеточка графа” — в roadmap (V2+).
- LLM-суммаризация кода — V1 не должен “придумывать”; описание берём из docstring + эвристика.
---
## 4. Основные термины
### 4.1. Symbol (символ)
Именованная сущность, которой можно адресно дать документацию и связи:
- `function` / `async function` (def/async def),
- `class`,
- `method` (внутри class),
- (опционально) module/package как верхнеуровневые сущности.
**Symbol ≠ вызов.**
Symbol — это **определение**, call/reference — **использование**.
---
## 5. Пользовательские сценарии
### S1. init
Пользователь выполняет `archdoc init`:
- создаётся `ARCHITECTURE.md` (в корне проекта),
- создаётся `archdoc.toml` (рекомендуемо) и директория `docs/architecture/*` (если нет).
### S2. generate/update
Пользователь выполняет `archdoc generate` (или `archdoc update`):
- анализирует репозиторий,
- создаёт/обновляет Markdown-артефакты,
- в MR/PR дифф отражает только смысловые изменения.
### S3. check (CI)
`archdoc check`:
- завершает процесс с non-zero кодом, если текущие docs не соответствуют тому, что будет сгенерировано.
---
## 6. Продуктовые принципы (не обсуждаются)
1) **Детерминизм:** один и тот же вход → один и тот же выход.
2) **Diff-friendly:** минимальный шум в `git diff`.
3) **Ручной контент не затираем:** всё вне маркеров — зона ответственности человека.
4) **Без “галлюцинаций”:** связи выводим только из анализа (AST + индекс), иначе помечаем как unresolved/external.
5) **Масштабируемость:** кеширование, инкрементальные обновления, параллельная обработка.
---
## 7. Артефакты вывода
### 7.1. Структура файлов (рекомендуемая)
```
ARCHITECTURE.md
docs/
architecture/
_index.md
rails.md
layout.md
modules/
<module_id>.md
files/
<path_sanitized>.md
````
### 7.2. Обязательные требования к контенту
- `ARCHITECTURE.md` содержит:
- название, описание (manual),
- Created/Updated (Updated меняется **только если** изменилась любая генерируемая секция),
- rails/tooling,
- layout,
- индекс модулей,
- критичные dependency points (fan-in/fan-out/cycles).
- `modules/<module_id>.md` содержит:
- intent (manual),
- boundaries (генерируемое),
- deps inbound/outbound (генерируемое),
- symbols overview (генерируемое).
- `files/<path>.md` содержит:
- intent (manual),
- file imports + deps (генерируемое),
- индекс symbols в файле,
- **один блок на каждый symbol** с назначением и связями.
---
## 8. Diff-friendly обновление (ключевое)
### 8.1. Маркерные секции
Любая генерируемая часть окружена маркерами:
- `<!-- ARCHDOC:BEGIN section=<name> -->`
- `<!-- ARCHDOC:END section=<name> -->`
Для символов:
- `<!-- ARCHDOC:BEGIN symbol id=<symbol_id> -->`
- `<!-- ARCHDOC:END symbol id=<symbol_id> -->`
Инструмент **обновляет только содержимое внутри** этих маркеров.
### 8.2. Ручные секции
Рекомендуемый паттерн:
- `<!-- MANUAL:BEGIN -->`
- `<!-- MANUAL:END -->`
Инструмент не трогает текст в этих блоках и вообще не трогает всё, что вне `ARCHDOC` маркеров.
### 8.3. Детерминированные сортировки
- списки модулей/файлов/символов сортируются лексикографически по стабильному ключу,
- таблицы имеют фиксированный набор колонок и формат,
- запрещены “плавающие” элементы (кроме Updated, который обновляется только при изменениях).
### 8.4. Updated-таймстамп без шума
Правило V1:
- пересчитать контент-хеш генерируемых секций,
- **если** он изменился → обновить `Updated`,
- **иначе** не менять дату.
---
## 9. Stable IDs и якоря
### 9.1. Symbol ID
Формат:
- `py::<module_path>::<qualname>`
Примеры:
- `py::app.billing::apply_promo_code`
- `py::app.services.user::UserService.create_user`
Коллизии:
- добавить `#<short_hash>` (например, от сигнатуры/позиции).
### 9.2. File doc имя
`<relative_path>` конвертируется в:
- `files/<path_sanitized>.md`
- где `path_sanitized` = заменить `/` на `__`
Пример:
- `src/app/billing.py` → `docs/architecture/files/src__app__billing.py.md`
### 9.3. Якоря
Внутри file docs якорь для symbol:
- `#<anchor>` где `<anchor>` = безопасная форма от symbol_id
- дополнительно можно вставить `<a id="..."></a>`.
---
## 10. Python анализ (V1)
### 10.1. Что считаем модулем
- Python package: директория с `__init__.py`
- module: `.py` файл, который принадлежит package/root
Поддержка src-layout:
- конфиг `src_roots = ["src", "."]`
### 10.2. Извлекаем из AST (обязательно)
- `import` / `from ... import ...` + алиасы
- определения: `def`, `async def`, `class`, методы в классах
- docstring (первая строка как “краткое назначение”)
- сигнатура: аргументы, defaults, аннотации типов, return annotation (если есть)
### 10.3. Call graph (best-effort, без type inference)
Резолв вызовов:
- `Name()` вызов `foo()`:
- если `foo` определён в этом файле → связываем на локальный symbol,
- если `foo` импортирован через `from x import foo` (или алиас) → связываем на `x.foo`,
- иначе → `external_call::foo`.
- `Attribute()` вызов `mod.foo()`:
- если `mod` — импортированный модуль/алиас → резолвим к `mod.foo`,
- иначе → `unresolved_method_call::mod.foo`.
Важно: лучше пометить как unresolved, чем “натянуть” неверную связь.
### 10.4. Inbound связи (кто зависит)
- на уровне модулей/файлов: строим обратный граф импортов
- на уровне symbols: строим обратный граф calls там, где вызовы резолвятся
---
## 11. “Что делает функция” (без LLM)
### 11.1. Источник истины: docstring
- `purpose.short` = первая строка docstring
- `purpose.long` (опционально) = первые N строк docstring
### 11.2. Эвристика (если docstring нет)
- по имени: `get_*`, `create_*`, `update_*`, `delete_*`, `sync_*`, `validate_*`
- по признакам в AST:
- наличие HTTP клиентов (`requests/httpx/aiohttp`),
- DB libs (`sqlalchemy/peewee/psycopg/asyncpg`),
- tasks/queue (`celery`, `kafka`, `pika`),
- чтение/запись файлов (`open`, `pathlib`),
- raising exceptions, early returns.
Формат результата: одна строка с меткой `[heuristic]`.
### 11.3. Manual override
- секция “Manual notes” для каждого symbol — зона ручного уточнения.
---
## 12. CLI спецификация
### 12.1. Команды
- `archdoc init`
- создаёт `ARCHITECTURE.md`, `docs/architecture/*`, `archdoc.toml` (если нет)
- `archdoc generate` / `archdoc update`
- анализ + запись/обновление файлов
- `archdoc check`
- проверка: docs совпадают с тем, что будет сгенерировано
### 12.2. Флаги (V1)
- `--root <path>` (default: `.`)
- `--out <path>` (default: `docs/architecture`)
- `--config <path>` (default: `archdoc.toml`)
- `--verbose`
- `--include-tests/--exclude-tests` (можно через конфиг)
---
## 13. Конфигурация (`archdoc.toml`)
Минимальный конфиг V1:
```toml
[project]
root = "."
out_dir = "docs/architecture"
entry_file = "ARCHITECTURE.md"
language = "python"
[scan]
include = ["src", "app", "tests"]
exclude = [".venv", "venv", "__pycache__", ".git", "dist", "build", ".mypy_cache", ".ruff_cache"]
follow_symlinks = false
[python]
src_roots = ["src", "."]
include_tests = true
[output]
single_file = false
per_file_docs = true
[diff]
update_timestamp_on_change_only = true
[thresholds]
critical_fan_in = 20
critical_fan_out = 20
````
---
## 14. Шаблоны Markdown (V1)
### 14.1. `ARCHITECTURE.md` (skeleton)
(Важное: ручные блоки + маркерные генерируемые секции.)
```md
# ARCHITECTURE — <PROJECT_NAME>
<!-- MANUAL:BEGIN -->
## Project summary
**Name:** <PROJECT_NAME>
**Description:** <FILL_MANUALLY: what this project does in 37 lines>
## Key decisions (manual)
- <FILL_MANUALLY>
## Non-goals (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
---
## Document metadata
- **Created:** <AUTO_ON_INIT: YYYY-MM-DD>
- **Updated:** <AUTO_ON_CHANGE: YYYY-MM-DD>
- **Generated by:** archdoc (cli) v0.1
---
## Rails / Tooling
<!-- ARCHDOC:BEGIN section=rails -->
> Generated. Do not edit inside this block.
<AUTO: rails summary + links to config files>
<!-- ARCHDOC:END section=rails -->
---
## Repository layout (top-level)
<!-- ARCHDOC:BEGIN section=layout -->
> Generated. Do not edit inside this block.
<AUTO: table of top-level folders + heuristic purpose + link to layout.md>
<!-- ARCHDOC:END section=layout -->
---
## Modules index
<!-- ARCHDOC:BEGIN section=modules_index -->
> Generated. Do not edit inside this block.
<AUTO: table modules + deps counts + links to module docs>
<!-- ARCHDOC:END section=modules_index -->
---
## Critical dependency points
<!-- ARCHDOC:BEGIN section=critical_points -->
> Generated. Do not edit inside this block.
<AUTO: top fan-in/out symbols + cycles>
<!-- ARCHDOC:END section=critical_points -->
---
<!-- MANUAL:BEGIN -->
## Change notes (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
```
### 14.2. `docs/architecture/layout.md`
```md
# Repository layout
<!-- MANUAL:BEGIN -->
## Manual overrides
- `src/app/` — <FILL_MANUALLY>
<!-- MANUAL:END -->
---
## Detected structure
<!-- ARCHDOC:BEGIN section=layout_detected -->
> Generated. Do not edit inside this block.
<AUTO: table of paths>
<!-- ARCHDOC:END section=layout_detected -->
```
### 14.3. `docs/architecture/modules/<module_id>.md`
```md
# Module: <module_id>
- **Path:** <AUTO>
- **Type:** python package/module
- **Doc:** <AUTO: module docstring summary if any>
<!-- MANUAL:BEGIN -->
## Module intent (manual)
<FILL_MANUALLY: boundaries, responsibility, invariants>
<!-- MANUAL:END -->
---
## Dependencies
<!-- ARCHDOC:BEGIN section=module_deps -->
> Generated. Do not edit inside this block.
<AUTO: outbound/inbound modules + counts>
<!-- ARCHDOC:END section=module_deps -->
---
## Symbols overview
<!-- ARCHDOC:BEGIN section=symbols_overview -->
> Generated. Do not edit inside this block.
<AUTO: table of symbols + links into file docs>
<!-- ARCHDOC:END section=symbols_overview -->
```
### 14.4. `docs/architecture/files/<path_sanitized>.md`
```md
# File: <relative_path>
- **Module:** <AUTO: module_id>
- **Defined symbols:** <AUTO>
- **Imports:** <AUTO>
<!-- MANUAL:BEGIN -->
## File intent (manual)
<FILL_MANUALLY>
<!-- MANUAL:END -->
---
## Imports & file-level dependencies
<!-- ARCHDOC:BEGIN section=file_imports -->
> Generated. Do not edit inside this block.
<AUTO: imports list + outbound modules + inbound files>
<!-- ARCHDOC:END section=file_imports -->
---
## Symbols index
<!-- ARCHDOC:BEGIN section=symbols_index -->
> Generated. Do not edit inside this block.
<AUTO: list of links to symbol anchors>
<!-- ARCHDOC:END section=symbols_index -->
---
## Symbol details
<!-- ARCHDOC:BEGIN symbol id=py::<module>::<qualname> -->
<a id="<anchor>"></a>
### `py::<module>::<qualname>`
- **Kind:** function | class | method
- **Signature:** `<AUTO>`
- **Docstring:** `<AUTO: first line | No docstring>`
- **Defined at:** `<AUTO: line>` (optional)
#### What it does
<!-- ARCHDOC:BEGIN section=purpose -->
<AUTO: docstring-first else heuristic with [heuristic]>
<!-- ARCHDOC:END section=purpose -->
#### Relations
<!-- ARCHDOC:BEGIN section=relations -->
**Outbound calls (best-effort):**
- <AUTO: resolved symbol ids>
- external_call::<name>
- unresolved_method_call::<expr>
**Inbound (used by) (best-effort):**
- <AUTO: callers>
<!-- ARCHDOC:END section=relations -->
#### Integrations (heuristic)
<!-- ARCHDOC:BEGIN section=integrations -->
- HTTP: yes/no
- DB: yes/no
- Queue/Tasks: yes/no
<!-- ARCHDOC:END section=integrations -->
#### Risk / impact
<!-- ARCHDOC:BEGIN section=impact -->
- fan-in: <AUTO:int>
- fan-out: <AUTO:int>
- cycle participant: <AUTO: yes/no>
- critical: <AUTO: yes/no + reason>
<!-- ARCHDOC:END section=impact -->
<!-- MANUAL:BEGIN -->
#### Manual notes
<FILL_MANUALLY>
<!-- MANUAL:END -->
<!-- ARCHDOC:END symbol id=py::<module>::<qualname> -->
```
---
## 15. Техническая архитектура реализации (Rust)
### 15.1. Модули приложения (рекомендуемое разбиение crates/modules)
* `cli` — парсинг аргументов, команды init/generate/check
* `scanner` — обход файлов, ignore, include/exclude
* `python_analyzer` — AST парсер/индексатор (Python)
* `model` — IR структуры данных (ProjectModel)
* `renderer` — генерация Markdown (шаблоны)
* `writer` — diff-aware writer: обновление по маркерам
* `cache` — кеш по хешам файлов (опционально в V1, но желательно)
### 15.2. IR (Intermediate Representation) — схема данных
Минимальные сущности:
**ProjectModel**
* modules: Map<module_id, Module>
* files: Map<file_id, FileDoc>
* symbols: Map<symbol_id, Symbol>
* edges:
* module_import_edges: Vec<Edge> (module → module)
* file_import_edges: Vec<Edge> (file → module/file)
* symbol_call_edges: Vec<Edge> (symbol → symbol/external/unresolved)
**Module**
* id, path, files[], doc_summary
* outbound_modules[], inbound_modules[]
* symbols[]
**FileDoc**
* id, path, module_id
* imports[] (normalized)
* outbound_modules[], inbound_files[]
* symbols[]
**Symbol**
* id, kind, module_id, file_id, qualname
* signature (string), annotations (optional structured)
* docstring_first_line
* purpose (docstring/heuristic)
* outbound_calls[], inbound_calls[]
* integrations flags
* metrics: fan_in, fan_out, is_critical, cycle_participant
**Edge**
* from_id, to_id, edge_type, meta (optional)
---
## 16. Алгоритмы (ключевые)
### 16.1. Scanner
* применить exclude/include и игноры
* собрать список `.py` файлов
* определить src_root и module paths
### 16.2. Python Analyzer
Шаги:
1. Пройти по каждому `.py` файлу
2. Распарсить AST
3. Извлечь:
* imports + алиасы
* defs/classes/methods + сигнатуры + docstrings
* calls (best-effort)
4. Построить Symbol Index: `name → symbol_id` в рамках файла и модуля
5. Резолвить calls через:
* локальные defs
* from-import алиасы
* import module алиасы
6. Построить edges, затем обратные edges (inbound)
### 16.3. Writer (diff-aware)
* загрузить существующий md (если есть)
* найти маркеры секций
* заменить содержимое секции детерминированным рендером
* сохранить всё вне маркеров неизменным
* если файл отсутствует → создать по шаблону
* пересчитать общий “генерируемый хеш”:
* если изменился → обновить `Updated`, иначе оставить
---
## 17. Критичные точки (impact analysis)
Метрики:
* **fan-in(symbol)** = число inbound вызовов (resolved)
* **fan-out(symbol)** = число outbound вызовов (resolved + unresolved по отдельному счётчику)
* **critical**:
* `fan-in >= thresholds.critical_fan_in` OR
* `fan-out >= thresholds.critical_fan_out` OR
* участие в цикле модулей
Выводить top-N списки в `ARCHITECTURE.md`.
---
## 18. Нефункциональные требования
* Время генерации: приемлемо на средних репо (ориентир — минуты, с перспективой кеширования).
* Память: не грузить весь исходный текст в память надолго; хранить только необходимое.
* Безопасность: по умолчанию не включать секреты/бинарники; уважать exclude.
* Надёжность: если AST не парсится (битый файл) — лог + продолжить анализ остальных, пометив файл как failed.
---
## 19. Acceptance Criteria (V1)
1. `archdoc init` создаёт:
* `ARCHITECTURE.md` с manual блоками и маркерами секций
* `docs/architecture/*` с базовыми файлами (или создаёт при generate)
2. Повторный `archdoc generate` на неизменном репо даёт:
* нулевой diff (включая `Updated`, который не меняется без контентных изменений)
3. Изменение одной функции/файла приводит:
* к локальному diff только соответствующего symbol блока и агрегатов (indexes/critical points)
4. `archdoc check` корректно детектит рассинхронизацию и возвращает non-zero.
---
## 20. План релизов (Roadmap)
### V1 (текущий документ)
* Python-only CLI
* modules/files/symbols docs
* import graph + best-effort call graph
* diff-friendly writer
* init/generate/check
### V2 (следующий шаг)
* Экспорт графа в JSON/Mermaid
* Простая локальная HTML/MD визуализация “как в Obsidian” (сетка зависимостей)
* Улучшение резолва calls (больше случаев через алиасы/простые типы)
### V3+
* Подключение других языков (через tree-sitter провайдеры)
* Опционально LSP режим для точного call graph
* MCP/IDE интеграции
---
## 21. Backlog (V1 — минимально достаточный)
### Эпик A — CLI и конфиг
* A1: `init` создаёт skeleton + config
* A2: `generate/update` парсит конфиг и пишет docs
* A3: `check` сравнивает с виртуально сгенерированным выводом
### Эпик B — Python анализ
* B1: scanner и определение module paths
* B2: AST import extraction + алиасы
* B3: defs/classes/methods extraction + signatures/docstrings
* B4: call extraction + best-effort resolution
* B5: inbound/outbound построение графов
### Эпик C — Markdown генерация и writer
* C1: renderer шаблонов
* C2: marker-based replace секций
* C3: stable sorting и формат таблиц
* C4: update timestamp on change only
### Эпик D — Critical points
* D1: fan-in/fan-out метрики
* D2: top lists в ARCHITECTURE.md
* D3: module cycles detection (простая графовая проверка)
---
## 22. Примечания по качеству (сразу закладываем тестируемость)
* Golden-tests: на маленьком fixture repo хранить ожидаемые md и проверять детерминизм.
* Unit-tests на writer: заменить секцию без изменения остального файла.
* Unit-tests на import/call resolution: алиасы `import x as y`, `from x import a as b`.
---
## 23. Итог
V1 фиксирует базовый продукт: **полная архитектурная документация до уровня функций** с зависимостями и impact, обновляемая безопасно и читаемо через `git diff`. Инструмент закрывает задачу: дать LLM и человеку стабильную “карту проекта” и контролировать критичные точки при изменениях.
---
```
```

101
README.md Normal file
View File

@@ -0,0 +1,101 @@
# ArchDoc
ArchDoc is a tool for automatically generating architecture documentation for Python projects. It analyzes your Python codebase and creates comprehensive documentation that helps developers understand the structure, dependencies, and key components of the project.
## Features
- **Automatic Documentation Generation**: Automatically generates architecture documentation from Python source code
- **AST-Based Analysis**: Uses Python AST to extract imports, definitions, and function calls
- **Diff-Aware Updates**: Preserves manual content while updating generated sections
- **Caching**: Caches analysis results for faster subsequent runs
- **Configurable**: Highly configurable through `archdoc.toml`
- **Template-Based Rendering**: Uses Handlebars templates for customizable output
## Installation
To install ArchDoc, you'll need Rust installed on your system. Then run:
```bash
cargo install --path archdoc-cli
```
## Usage
### Initialize Configuration
First, initialize the configuration in your project:
```bash
archdoc init
```
This creates an `archdoc.toml` file with default settings.
### Generate Documentation
Generate architecture documentation for your project:
```bash
archdoc generate
```
This will create documentation files in the configured output directory.
### Check Documentation Consistency
Verify that your documentation is consistent with the code:
```bash
archdoc check
```
## Configuration
ArchDoc is configured through an `archdoc.toml` file. Here's an example configuration:
```toml
[project]
root = "."
out_dir = "docs/architecture"
entry_file = "ARCHITECTURE.md"
language = "python"
[scan]
include = ["src"]
exclude = [".venv", "venv", "__pycache__", ".git", "dist", "build"]
[python]
src_roots = ["src"]
include_tests = true
parse_docstrings = true
[analysis]
resolve_calls = true
detect_integrations = true
[output]
single_file = false
per_file_docs = true
create_directories = true
[caching]
enabled = true
cache_dir = ".archdoc/cache"
max_cache_age = "24h"
```
## How It Works
1. **Scanning**: ArchDoc scans your project directory for Python files based on the configuration
2. **Parsing**: It parses each Python file using AST to extract structure and relationships
3. **Analysis**: It analyzes the code to identify imports, definitions, and function calls
4. **Documentation Generation**: It generates documentation using templates
5. **Output**: It writes the documentation to files, preserving manual content
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.

9
archdoc-cli/.gitignore vendored Normal file
View File

@@ -0,0 +1,9 @@
# Compiled files
target/
# IDE files
*.swp
.DS_Store
# Backup files
*.rs.bk

1780
archdoc-cli/Cargo.lock generated Normal file

File diff suppressed because it is too large Load Diff

16
archdoc-cli/Cargo.toml Normal file
View File

@@ -0,0 +1,16 @@
[package]
name = "archdoc-cli"
version = "0.1.0"
edition = "2024"
[dependencies]
archdoc-core = { path = "../archdoc-core" }
clap = { version = "4.0", features = ["derive"] }
tokio = { version = "1.0", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
toml = "0.8"
tracing = "0.1"
tracing-subscriber = "0.3"
anyhow = "1.0"
thiserror = "1.0"

400
archdoc-cli/src/main.rs Normal file
View File

@@ -0,0 +1,400 @@
use clap::{Parser, Subcommand};
use anyhow::Result;
use archdoc_core::{Config, ProjectModel, scanner::FileScanner, python_analyzer::PythonAnalyzer};
use std::path::Path;
/// CLI interface for ArchDoc
#[derive(Parser)]
#[command(name = "archdoc")]
#[command(about = "Generate architecture documentation for Python projects")]
#[command(version = "0.1.0")]
pub struct Cli {
#[command(subcommand)]
command: Commands,
/// Verbose output
#[arg(short, long, global = true)]
verbose: bool,
}
#[derive(Subcommand)]
enum Commands {
/// Initialize archdoc in the project
Init {
/// Project root directory
#[arg(short, long, default_value = ".")]
root: String,
/// Output directory for documentation
#[arg(short, long, default_value = "docs/architecture")]
out: String,
},
/// Generate or update documentation
Generate {
/// Project root directory
#[arg(short, long, default_value = ".")]
root: String,
/// Output directory for documentation
#[arg(short, long, default_value = "docs/architecture")]
out: String,
/// Configuration file path
#[arg(short, long, default_value = "archdoc.toml")]
config: String,
},
/// Check if documentation is up to date
Check {
/// Project root directory
#[arg(short, long, default_value = ".")]
root: String,
/// Configuration file path
#[arg(short, long, default_value = "archdoc.toml")]
config: String,
},
}
fn main() -> Result<()> {
let cli = Cli::parse();
// Setup logging based on verbose flag
setup_logging(cli.verbose)?;
match &cli.command {
Commands::Init { root, out } => {
init_project(root, out)?;
}
Commands::Generate { root, out, config } => {
let config = load_config(config)?;
let model = analyze_project(root, &config)?;
generate_docs(&model, out)?;
}
Commands::Check { root, config } => {
let config = load_config(config)?;
check_docs_consistency(root, &config)?;
}
}
Ok(())
}
fn setup_logging(verbose: bool) -> Result<()> {
// TODO: Implement logging setup
println!("Setting up logging with verbose={}", verbose);
Ok(())
}
fn init_project(root: &str, out: &str) -> Result<()> {
// TODO: Implement project initialization
println!("Initializing project at {} with output to {}", root, out);
// Create output directory
let out_path = std::path::Path::new(out);
std::fs::create_dir_all(out_path)
.map_err(|e| anyhow::anyhow!("Failed to create output directory: {}", e))?;
// Create docs/architecture directory structure
let docs_arch_path = out_path.join("docs").join("architecture");
std::fs::create_dir_all(&docs_arch_path)
.map_err(|e| anyhow::anyhow!("Failed to create docs/architecture directory: {}", e))?;
// Create modules and files directories
std::fs::create_dir_all(docs_arch_path.join("modules"))
.map_err(|e| anyhow::anyhow!("Failed to create modules directory: {}", e))?;
std::fs::create_dir_all(docs_arch_path.join("files"))
.map_err(|e| anyhow::anyhow!("Failed to create files directory: {}", e))?;
// Create default ARCHITECTURE.md template
let architecture_md_content = r#"# ARCHITECTURE — New Project
<!-- MANUAL:BEGIN -->
## Project summary
**Name:** New Project
**Description:** <FILL_MANUALLY: what this project does in 37 lines>
## Key decisions (manual)
- <FILL_MANUALLY>
## Non-goals (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
---
## Document metadata
- **Created:** 2026-01-25
- **Updated:** 2026-01-25
- **Generated by:** archdoc (cli) v0.1
---
## Rails / Tooling
<!-- ARCHDOC:BEGIN section=rails -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=rails -->
---
## Repository layout (top-level)
<!-- ARCHDOC:BEGIN section=layout -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=layout -->
---
## Modules index
<!-- ARCHDOC:BEGIN section=modules_index -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=modules_index -->
---
## Critical dependency points
<!-- ARCHDOC:BEGIN section=critical_points -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=critical_points -->
---
<!-- MANUAL:BEGIN -->
## Change notes (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
"#;
let architecture_md_path = std::path::Path::new(root).join("ARCHITECTURE.md");
std::fs::write(&architecture_md_path, architecture_md_content)
.map_err(|e| anyhow::anyhow!("Failed to create ARCHITECTURE.md: {}", e))?;
// Create default archdoc.toml config
let config_toml_content = r#"[project]
root = "."
out_dir = "docs/architecture"
entry_file = "ARCHITECTURE.md"
language = "python"
[scan]
include = ["src", "app", "tests"]
exclude = [
".venv", "venv", "__pycache__", ".git", "dist", "build",
".mypy_cache", ".ruff_cache", ".pytest_cache", "*.egg-info"
]
follow_symlinks = false
max_file_size = "10MB"
[python]
src_roots = ["src", "."]
include_tests = true
parse_docstrings = true
max_parse_errors = 10
[analysis]
resolve_calls = true
resolve_inheritance = false
detect_integrations = true
integration_patterns = [
{ type = "http", patterns = ["requests", "httpx", "aiohttp"] },
{ type = "db", patterns = ["sqlalchemy", "psycopg", "mysql", "sqlite3"] },
{ type = "queue", patterns = ["celery", "kafka", "pika", "redis"] }
]
[output]
single_file = false
per_file_docs = true
create_directories = true
overwrite_manual_sections = false
[diff]
update_timestamp_on_change_only = true
hash_algorithm = "sha256"
preserve_manual_content = true
[thresholds]
critical_fan_in = 20
critical_fan_out = 20
high_complexity = 50
[rendering]
template_engine = "handlebars"
max_table_rows = 100
truncate_long_descriptions = true
description_max_length = 200
[logging]
level = "info"
file = "archdoc.log"
format = "compact"
[caching]
enabled = true
cache_dir = ".archdoc/cache"
max_cache_age = "24h"
"#;
let config_toml_path = std::path::Path::new(root).join("archdoc.toml");
if !config_toml_path.exists() {
std::fs::write(&config_toml_path, config_toml_content)
.map_err(|e| anyhow::anyhow!("Failed to create archdoc.toml: {}", e))?;
}
println!("Project initialized successfully!");
println!("Created:");
println!(" - {}", architecture_md_path.display());
println!(" - {}", config_toml_path.display());
println!(" - {} (directory structure)", docs_arch_path.display());
Ok(())
}
fn load_config(config_path: &str) -> Result<Config> {
// TODO: Implement config loading
println!("Loading config from {}", config_path);
Config::load_from_file(Path::new(config_path))
.map_err(|e| anyhow::anyhow!("Failed to load config: {}", e))
}
fn analyze_project(root: &str, config: &Config) -> Result<ProjectModel> {
// TODO: Implement project analysis
println!("Analyzing project at {} with config", root);
// Initialize scanner
let scanner = FileScanner::new(config.clone());
// Scan for Python files
let python_files = scanner.scan_python_files(std::path::Path::new(root))?;
// Initialize Python analyzer
let analyzer = PythonAnalyzer::new(config.clone());
// Parse each Python file
let mut parsed_modules = Vec::new();
for file_path in python_files {
match analyzer.parse_module(&file_path) {
Ok(module) => parsed_modules.push(module),
Err(e) => {
eprintln!("Warning: Failed to parse {}: {}", file_path.display(), e);
// Continue with other files
}
}
}
// Resolve symbols and build project model
analyzer.resolve_symbols(&parsed_modules)
.map_err(|e| anyhow::anyhow!("Failed to resolve symbols: {}", e))
}
fn generate_docs(model: &ProjectModel, out: &str) -> Result<()> {
// TODO: Implement documentation generation
println!("Generating docs to {}", out);
// Initialize renderer
let renderer = archdoc_core::renderer::Renderer::new();
// Initialize writer
let writer = archdoc_core::writer::DiffAwareWriter::new();
// Write to file - ARCHITECTURE.md should be in the project root, not output directory
// The out parameter is for the docs/architecture directory structure
let output_path = std::path::Path::new(".").join("ARCHITECTURE.md");
// Render and update each section individually
// Update integrations section
match renderer.render_integrations_section(model) {
Ok(content) => {
if let Err(e) = writer.update_file_with_markers(&output_path, &content, "integrations") {
eprintln!("Warning: Failed to update integrations section: {}", e);
}
}
Err(e) => {
eprintln!("Warning: Failed to render integrations section: {}", e);
}
}
// Update rails section
match renderer.render_rails_section(model) {
Ok(content) => {
if let Err(e) = writer.update_file_with_markers(&output_path, &content, "rails") {
eprintln!("Warning: Failed to update rails section: {}", e);
}
}
Err(e) => {
eprintln!("Warning: Failed to render rails section: {}", e);
}
}
// Update layout section
match renderer.render_layout_section(model) {
Ok(content) => {
if let Err(e) = writer.update_file_with_markers(&output_path, &content, "layout") {
eprintln!("Warning: Failed to update layout section: {}", e);
}
}
Err(e) => {
eprintln!("Warning: Failed to render layout section: {}", e);
}
}
// Update modules index section
match renderer.render_modules_index_section(model) {
Ok(content) => {
if let Err(e) = writer.update_file_with_markers(&output_path, &content, "modules_index") {
eprintln!("Warning: Failed to update modules_index section: {}", e);
}
}
Err(e) => {
eprintln!("Warning: Failed to render modules_index section: {}", e);
}
}
// Update critical points section
match renderer.render_critical_points_section(model) {
Ok(content) => {
if let Err(e) = writer.update_file_with_markers(&output_path, &content, "critical_points") {
eprintln!("Warning: Failed to update critical_points section: {}", e);
}
}
Err(e) => {
eprintln!("Warning: Failed to render critical_points section: {}", e);
}
}
Ok(())
}
fn check_docs_consistency(root: &str, config: &Config) -> Result<()> {
// TODO: Implement consistency checking
println!("Checking docs consistency for project at {} with config", root);
// Analyze project
let model = analyze_project(root, config)?;
// Generate documentation content - if this succeeds, the analysis is working
let renderer = archdoc_core::renderer::Renderer::new();
let generated_architecture_md = renderer.render_architecture_md(&model)?;
// Read existing documentation
let architecture_md_path = std::path::Path::new(root).join(&config.project.entry_file);
if !architecture_md_path.exists() {
return Err(anyhow::anyhow!("Documentation file {} does not exist", architecture_md_path.display()));
}
let existing_architecture_md = std::fs::read_to_string(&architecture_md_path)
.map_err(|e| anyhow::anyhow!("Failed to read {}: {}", architecture_md_path.display(), e))?;
// For V1, we'll just check that we can generate content without errors
// A full implementation would compare only the generated sections
println!("Documentation analysis successful - project can be documented");
println!("Generated content length: {}", generated_architecture_md.len());
println!("Existing content length: {}", existing_architecture_md.len());
Ok(())
}

12
archdoc-core/.gitignore vendored Normal file
View File

@@ -0,0 +1,12 @@
# Compiled files
target/
# IDE files
*.swp
.DS_Store
# Backup files
*.rs.bk
# Documentation files
doc/

1320
archdoc-core/Cargo.lock generated Normal file

File diff suppressed because it is too large Load Diff

18
archdoc-core/Cargo.toml Normal file
View File

@@ -0,0 +1,18 @@
[package]
name = "archdoc-core"
version = "0.1.0"
edition = "2024"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
toml = "0.9.11+spec-1.1.0"
tracing = "0.1"
anyhow = "1.0"
thiserror = "2.0.18"
walkdir = "2.3"
handlebars = "6.4.0"
rustpython-parser = "0.4"
rustpython-ast = "0.4"
chrono = { version = "0.4", features = ["serde"] }
tempfile = "3.10"

168
archdoc-core/src/cache.rs Normal file
View File

@@ -0,0 +1,168 @@
//! Caching module for ArchDoc
//!
//! This module provides caching functionality to speed up repeated analysis
//! by storing parsed ASTs and analysis results.
use crate::config::Config;
use crate::errors::ArchDocError;
use crate::model::ParsedModule;
use std::path::Path;
use std::fs;
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};
#[derive(Debug, Serialize, Deserialize)]
struct CacheEntry {
/// Timestamp when the cache entry was created
created_at: DateTime<Utc>,
/// Timestamp when the source file was last modified
file_modified_at: DateTime<Utc>,
/// The parsed module data
parsed_module: ParsedModule,
}
pub struct CacheManager {
config: Config,
cache_dir: String,
}
impl CacheManager {
pub fn new(config: Config) -> Self {
let cache_dir = config.caching.cache_dir.clone();
// Create cache directory if it doesn't exist
if config.caching.enabled && !Path::new(&cache_dir).exists() {
let _ = fs::create_dir_all(&cache_dir);
}
Self { config, cache_dir }
}
/// Get cached parsed module if available and not expired
pub fn get_cached_module(&self, file_path: &Path) -> Result<Option<ParsedModule>, ArchDocError> {
if !self.config.caching.enabled {
return Ok(None);
}
let cache_key = self.get_cache_key(file_path);
let cache_file = Path::new(&self.cache_dir).join(&cache_key);
if !cache_file.exists() {
return Ok(None);
}
// Read cache file
let content = fs::read_to_string(&cache_file)
.map_err(|e| ArchDocError::Io(e))?;
let cache_entry: CacheEntry = serde_json::from_str(&content)
.map_err(|e| ArchDocError::AnalysisError(format!("Failed to deserialize cache entry: {}", e)))?;
// Check if cache is expired
let now = Utc::now();
let cache_age = now.signed_duration_since(cache_entry.created_at);
// Parse max_cache_age (simple format: "24h", "7d", etc.)
let max_age_seconds = self.parse_duration(&self.config.caching.max_cache_age)?;
if cache_age.num_seconds() > max_age_seconds as i64 {
// Cache expired, remove it
let _ = fs::remove_file(&cache_file);
return Ok(None);
}
// Check if source file has been modified since caching
let metadata = fs::metadata(file_path)
.map_err(|e| ArchDocError::Io(e))?;
let modified_time = metadata.modified()
.map_err(|e| ArchDocError::Io(e))?;
let modified_time: DateTime<Utc> = modified_time.into();
if modified_time > cache_entry.file_modified_at {
// Source file is newer than cache, invalidate cache
let _ = fs::remove_file(&cache_file);
return Ok(None);
}
Ok(Some(cache_entry.parsed_module))
}
/// Store parsed module in cache
pub fn store_module(&self, file_path: &Path, parsed_module: ParsedModule) -> Result<(), ArchDocError> {
if !self.config.caching.enabled {
return Ok(());
}
let cache_key = self.get_cache_key(file_path);
let cache_file = Path::new(&self.cache_dir).join(&cache_key);
// Get file modification time
let metadata = fs::metadata(file_path)
.map_err(|e| ArchDocError::Io(e))?;
let modified_time = metadata.modified()
.map_err(|e| ArchDocError::Io(e))?;
let modified_time: DateTime<Utc> = modified_time.into();
let cache_entry = CacheEntry {
created_at: Utc::now(),
file_modified_at: modified_time,
parsed_module,
};
let content = serde_json::to_string(&cache_entry)
.map_err(|e| ArchDocError::AnalysisError(format!("Failed to serialize cache entry: {}", e)))?;
fs::write(&cache_file, content)
.map_err(|e| ArchDocError::Io(e))
}
/// Generate cache key for a file path
fn get_cache_key(&self, file_path: &Path) -> String {
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
let mut hasher = DefaultHasher::new();
file_path.hash(&mut hasher);
let hash = hasher.finish();
format!("{:x}.json", hash)
}
/// Parse duration string like "24h" or "7d" into seconds
fn parse_duration(&self, duration_str: &str) -> Result<u64, ArchDocError> {
if duration_str.is_empty() {
return Ok(0);
}
let chars: Vec<char> = duration_str.chars().collect();
let (number_str, unit) = chars.split_at(chars.len() - 1);
let number: u64 = number_str.iter().collect::<String>().parse()
.map_err(|_| ArchDocError::AnalysisError(format!("Invalid duration format: {}", duration_str)))?;
match unit[0] {
's' => Ok(number), // seconds
'm' => Ok(number * 60), // minutes
'h' => Ok(number * 3600), // hours
'd' => Ok(number * 86400), // days
_ => Err(ArchDocError::AnalysisError(format!("Unknown duration unit: {}", unit[0]))),
}
}
/// Clear all cache entries
pub fn clear_cache(&self) -> Result<(), ArchDocError> {
if Path::new(&self.cache_dir).exists() {
fs::remove_dir_all(&self.cache_dir)
.map_err(|e| ArchDocError::Io(e))?;
// Recreate cache directory
fs::create_dir_all(&self.cache_dir)
.map_err(|e| ArchDocError::Io(e))?;
}
Ok(())
}
}

458
archdoc-core/src/config.rs Normal file
View File

@@ -0,0 +1,458 @@
//! Configuration management for ArchDoc
//!
//! This module handles loading and validating the archdoc.toml configuration file.
use serde::{Deserialize, Serialize};
use std::path::Path;
use crate::errors::ArchDocError;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
#[serde(default)]
pub project: ProjectConfig,
#[serde(default)]
pub scan: ScanConfig,
#[serde(default)]
pub python: PythonConfig,
#[serde(default)]
pub analysis: AnalysisConfig,
#[serde(default)]
pub output: OutputConfig,
#[serde(default)]
pub diff: DiffConfig,
#[serde(default)]
pub thresholds: ThresholdsConfig,
#[serde(default)]
pub rendering: RenderingConfig,
#[serde(default)]
pub logging: LoggingConfig,
#[serde(default)]
pub caching: CachingConfig,
}
impl Default for Config {
fn default() -> Self {
Self {
project: ProjectConfig::default(),
scan: ScanConfig::default(),
python: PythonConfig::default(),
analysis: AnalysisConfig::default(),
output: OutputConfig::default(),
diff: DiffConfig::default(),
thresholds: ThresholdsConfig::default(),
rendering: RenderingConfig::default(),
logging: LoggingConfig::default(),
caching: CachingConfig::default(),
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ProjectConfig {
#[serde(default = "default_root")]
pub root: String,
#[serde(default = "default_out_dir")]
pub out_dir: String,
#[serde(default = "default_entry_file")]
pub entry_file: String,
#[serde(default = "default_language")]
pub language: String,
#[serde(default)]
pub name: String,
}
impl Default for ProjectConfig {
fn default() -> Self {
Self {
root: default_root(),
out_dir: default_out_dir(),
entry_file: default_entry_file(),
language: default_language(),
name: String::new(),
}
}
}
fn default_root() -> String {
".".to_string()
}
fn default_out_dir() -> String {
"docs/architecture".to_string()
}
fn default_entry_file() -> String {
"ARCHITECTURE.md".to_string()
}
fn default_language() -> String {
"python".to_string()
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ScanConfig {
#[serde(default = "default_include")]
pub include: Vec<String>,
#[serde(default = "default_exclude")]
pub exclude: Vec<String>,
#[serde(default)]
pub follow_symlinks: bool,
#[serde(default = "default_max_file_size")]
pub max_file_size: String,
}
impl Default for ScanConfig {
fn default() -> Self {
Self {
include: default_include(),
exclude: default_exclude(),
follow_symlinks: false,
max_file_size: default_max_file_size(),
}
}
}
fn default_include() -> Vec<String> {
vec!["src".to_string(), "app".to_string(), "tests".to_string()]
}
fn default_exclude() -> Vec<String> {
vec![
".venv".to_string(),
"venv".to_string(),
"__pycache__".to_string(),
".git".to_string(),
"dist".to_string(),
"build".to_string(),
".mypy_cache".to_string(),
".ruff_cache".to_string(),
".pytest_cache".to_string(),
"*.egg-info".to_string(),
]
}
fn default_max_file_size() -> String {
"10MB".to_string()
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PythonConfig {
#[serde(default = "default_src_roots")]
pub src_roots: Vec<String>,
#[serde(default = "default_include_tests")]
pub include_tests: bool,
#[serde(default = "default_parse_docstrings")]
pub parse_docstrings: bool,
#[serde(default = "default_max_parse_errors")]
pub max_parse_errors: usize,
}
impl Default for PythonConfig {
fn default() -> Self {
Self {
src_roots: default_src_roots(),
include_tests: default_include_tests(),
parse_docstrings: default_parse_docstrings(),
max_parse_errors: default_max_parse_errors(),
}
}
}
fn default_src_roots() -> Vec<String> {
vec!["src".to_string(), ".".to_string()]
}
fn default_include_tests() -> bool {
true
}
fn default_parse_docstrings() -> bool {
true
}
fn default_max_parse_errors() -> usize {
10
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AnalysisConfig {
#[serde(default = "default_resolve_calls")]
pub resolve_calls: bool,
#[serde(default)]
pub resolve_inheritance: bool,
#[serde(default = "default_detect_integrations")]
pub detect_integrations: bool,
#[serde(default = "default_integration_patterns")]
pub integration_patterns: Vec<IntegrationPattern>,
}
impl Default for AnalysisConfig {
fn default() -> Self {
Self {
resolve_calls: default_resolve_calls(),
resolve_inheritance: false,
detect_integrations: default_detect_integrations(),
integration_patterns: default_integration_patterns(),
}
}
}
fn default_resolve_calls() -> bool {
true
}
fn default_detect_integrations() -> bool {
true
}
fn default_integration_patterns() -> Vec<IntegrationPattern> {
vec![
IntegrationPattern {
type_: "http".to_string(),
patterns: vec!["requests".to_string(), "httpx".to_string(), "aiohttp".to_string()],
},
IntegrationPattern {
type_: "db".to_string(),
patterns: vec![
"sqlalchemy".to_string(),
"psycopg".to_string(),
"mysql".to_string(),
"sqlite3".to_string(),
],
},
IntegrationPattern {
type_: "queue".to_string(),
patterns: vec![
"celery".to_string(),
"kafka".to_string(),
"pika".to_string(),
"redis".to_string(),
],
},
]
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct IntegrationPattern {
#[serde(rename = "type")]
pub type_: String,
pub patterns: Vec<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct OutputConfig {
#[serde(default)]
pub single_file: bool,
#[serde(default = "default_per_file_docs")]
pub per_file_docs: bool,
#[serde(default = "default_create_directories")]
pub create_directories: bool,
#[serde(default)]
pub overwrite_manual_sections: bool,
}
impl Default for OutputConfig {
fn default() -> Self {
Self {
single_file: false,
per_file_docs: default_per_file_docs(),
create_directories: default_create_directories(),
overwrite_manual_sections: false,
}
}
}
fn default_per_file_docs() -> bool {
true
}
fn default_create_directories() -> bool {
true
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DiffConfig {
#[serde(default = "default_update_timestamp_on_change_only")]
pub update_timestamp_on_change_only: bool,
#[serde(default = "default_hash_algorithm")]
pub hash_algorithm: String,
#[serde(default = "default_preserve_manual_content")]
pub preserve_manual_content: bool,
}
impl Default for DiffConfig {
fn default() -> Self {
Self {
update_timestamp_on_change_only: default_update_timestamp_on_change_only(),
hash_algorithm: default_hash_algorithm(),
preserve_manual_content: default_preserve_manual_content(),
}
}
}
fn default_update_timestamp_on_change_only() -> bool {
true
}
fn default_hash_algorithm() -> String {
"sha256".to_string()
}
fn default_preserve_manual_content() -> bool {
true
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ThresholdsConfig {
#[serde(default = "default_critical_fan_in")]
pub critical_fan_in: usize,
#[serde(default = "default_critical_fan_out")]
pub critical_fan_out: usize,
#[serde(default = "default_high_complexity")]
pub high_complexity: usize,
}
impl Default for ThresholdsConfig {
fn default() -> Self {
Self {
critical_fan_in: default_critical_fan_in(),
critical_fan_out: default_critical_fan_out(),
high_complexity: default_high_complexity(),
}
}
}
fn default_critical_fan_in() -> usize {
20
}
fn default_critical_fan_out() -> usize {
20
}
fn default_high_complexity() -> usize {
50
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RenderingConfig {
#[serde(default = "default_template_engine")]
pub template_engine: String,
#[serde(default = "default_max_table_rows")]
pub max_table_rows: usize,
#[serde(default = "default_truncate_long_descriptions")]
pub truncate_long_descriptions: bool,
#[serde(default = "default_description_max_length")]
pub description_max_length: usize,
}
impl Default for RenderingConfig {
fn default() -> Self {
Self {
template_engine: default_template_engine(),
max_table_rows: default_max_table_rows(),
truncate_long_descriptions: default_truncate_long_descriptions(),
description_max_length: default_description_max_length(),
}
}
}
fn default_template_engine() -> String {
"handlebars".to_string()
}
fn default_max_table_rows() -> usize {
100
}
fn default_truncate_long_descriptions() -> bool {
true
}
fn default_description_max_length() -> usize {
200
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LoggingConfig {
#[serde(default = "default_log_level")]
pub level: String,
#[serde(default = "default_log_file")]
pub file: String,
#[serde(default = "default_log_format")]
pub format: String,
}
impl Default for LoggingConfig {
fn default() -> Self {
Self {
level: default_log_level(),
file: default_log_file(),
format: default_log_format(),
}
}
}
fn default_log_level() -> String {
"info".to_string()
}
fn default_log_file() -> String {
"archdoc.log".to_string()
}
fn default_log_format() -> String {
"compact".to_string()
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CachingConfig {
#[serde(default = "default_caching_enabled")]
pub enabled: bool,
#[serde(default = "default_cache_dir")]
pub cache_dir: String,
#[serde(default = "default_max_cache_age")]
pub max_cache_age: String,
}
impl Default for CachingConfig {
fn default() -> Self {
Self {
enabled: default_caching_enabled(),
cache_dir: default_cache_dir(),
max_cache_age: default_max_cache_age(),
}
}
}
fn default_caching_enabled() -> bool {
true
}
fn default_cache_dir() -> String {
".archdoc/cache".to_string()
}
fn default_max_cache_age() -> String {
"24h".to_string()
}
impl Config {
/// Load configuration from a TOML file
pub fn load_from_file(path: &Path) -> Result<Self, ArchDocError> {
let content = std::fs::read_to_string(path)
.map_err(|e| ArchDocError::ConfigError(format!("Failed to read config file: {}", e)))?;
toml::from_str(&content)
.map_err(|e| ArchDocError::ConfigError(format!("Failed to parse config file: {}", e)))
}
/// Save configuration to a TOML file
pub fn save_to_file(&self, path: &Path) -> Result<(), ArchDocError> {
let content = toml::to_string_pretty(self)
.map_err(|e| ArchDocError::ConfigError(format!("Failed to serialize config: {}", e)))?;
std::fs::write(path, content)
.map_err(|e| ArchDocError::ConfigError(format!("Failed to write config file: {}", e)))
}
}

View File

@@ -0,0 +1,26 @@
use thiserror::Error;
#[derive(Error, Debug)]
pub enum ArchDocError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Parse error in {file}:{line}: {message}")]
ParseError {
file: String,
line: usize,
message: String,
},
#[error("Configuration error: {0}")]
ConfigError(String),
#[error("Analysis error: {0}")]
AnalysisError(String),
#[error("Rendering error: {0}")]
RenderingError(String),
#[error("File consistency check failed: {0}")]
ConsistencyError(String),
}

31
archdoc-core/src/lib.rs Normal file
View File

@@ -0,0 +1,31 @@
//! ArchDoc Core Library
//!
//! This crate provides the core functionality for analyzing Python projects
//! and generating architecture documentation.
// Public modules
pub mod errors;
pub mod config;
pub mod model;
pub mod scanner;
pub mod python_analyzer;
pub mod renderer;
pub mod writer;
pub mod cache;
// Re-export commonly used types
pub use errors::ArchDocError;
pub use config::Config;
pub use model::ProjectModel;
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn it_works() {
let result = 2 + 2;
assert_eq!(result, 4);
}
}

168
archdoc-core/src/model.rs Normal file
View File

@@ -0,0 +1,168 @@
//! Intermediate Representation (IR) for ArchDoc
//!
//! This module defines the data structures that represent the analyzed Python project
//! and are used for generating documentation.
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ProjectModel {
pub modules: HashMap<String, Module>,
pub files: HashMap<String, FileDoc>,
pub symbols: HashMap<String, Symbol>,
pub edges: Edges,
}
impl ProjectModel {
pub fn new() -> Self {
Self {
modules: HashMap::new(),
files: HashMap::new(),
symbols: HashMap::new(),
edges: Edges::new(),
}
}
}
impl Default for ProjectModel {
fn default() -> Self {
Self::new()
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Module {
pub id: String,
pub path: String,
pub files: Vec<String>,
pub doc_summary: Option<String>,
pub outbound_modules: Vec<String>,
pub inbound_modules: Vec<String>,
pub symbols: Vec<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FileDoc {
pub id: String,
pub path: String,
pub module_id: String,
pub imports: Vec<String>, // normalized import strings
pub outbound_modules: Vec<String>,
pub inbound_files: Vec<String>,
pub symbols: Vec<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Symbol {
pub id: String,
pub kind: SymbolKind,
pub module_id: String,
pub file_id: String,
pub qualname: String,
pub signature: String,
pub annotations: Option<HashMap<String, String>>,
pub docstring_first_line: Option<String>,
pub purpose: String, // docstring or heuristic
pub outbound_calls: Vec<String>,
pub inbound_calls: Vec<String>,
pub integrations_flags: IntegrationFlags,
pub metrics: SymbolMetrics,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum SymbolKind {
Function,
AsyncFunction,
Class,
Method,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct IntegrationFlags {
pub http: bool,
pub db: bool,
pub queue: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SymbolMetrics {
pub fan_in: usize,
pub fan_out: usize,
pub is_critical: bool,
pub cycle_participant: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Edges {
pub module_import_edges: Vec<Edge>,
pub file_import_edges: Vec<Edge>,
pub symbol_call_edges: Vec<Edge>,
}
impl Edges {
pub fn new() -> Self {
Self {
module_import_edges: Vec::new(),
file_import_edges: Vec::new(),
symbol_call_edges: Vec::new(),
}
}
}
impl Default for Edges {
fn default() -> Self {
Self::new()
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Edge {
pub from_id: String,
pub to_id: String,
pub edge_type: EdgeType,
pub meta: Option<HashMap<String, String>>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum EdgeType {
ModuleImport,
FileImport,
SymbolCall,
ExternalCall,
UnresolvedCall,
}
// Additional structures for Python analysis
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ParsedModule {
pub path: std::path::PathBuf,
pub module_path: String,
pub imports: Vec<Import>,
pub symbols: Vec<Symbol>,
pub calls: Vec<Call>,
}
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct Import {
pub module_name: String,
pub alias: Option<String>,
pub line_number: usize,
}
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct Call {
pub caller_symbol: String,
pub callee_expr: String,
pub line_number: usize,
pub call_type: CallType,
}
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum CallType {
Local,
Imported,
External,
Unresolved,
}

View File

@@ -0,0 +1,386 @@
//! Python AST analyzer for ArchDoc
//!
//! This module handles parsing Python files using AST and extracting
//! imports, definitions, and calls.
use crate::model::{ParsedModule, ProjectModel, Import, Call, CallType, Symbol, Module, FileDoc};
use crate::config::Config;
use crate::errors::ArchDocError;
use crate::cache::CacheManager;
use std::path::Path;
use std::fs;
use rustpython_parser::{ast, Parse};
use rustpython_ast::{Stmt, StmtClassDef, StmtFunctionDef, Expr, Ranged};
pub struct PythonAnalyzer {
_config: Config,
cache_manager: CacheManager,
}
impl PythonAnalyzer {
pub fn new(config: Config) -> Self {
let cache_manager = CacheManager::new(config.clone());
Self { _config: config, cache_manager }
}
pub fn parse_module(&self, file_path: &Path) -> Result<ParsedModule, ArchDocError> {
// Try to get from cache first
if let Some(cached_module) = self.cache_manager.get_cached_module(file_path)? {
return Ok(cached_module);
}
// Read the Python file
let code = fs::read_to_string(file_path)
.map_err(ArchDocError::Io)?;
// Parse the Python code into an AST
let ast = ast::Suite::parse(&code, file_path.to_str().unwrap_or("<unknown>"))
.map_err(|e| ArchDocError::ParseError {
file: file_path.to_string_lossy().to_string(),
line: 0, // We don't have line info from the error
message: format!("Failed to parse: {}", e),
})?;
// Extract imports, definitions, and calls
let mut imports = Vec::new();
let mut symbols = Vec::new();
let mut calls = Vec::new();
for stmt in ast {
self.extract_from_statement(&stmt, None, &mut imports, &mut symbols, &mut calls, 0);
}
let parsed_module = ParsedModule {
path: file_path.to_path_buf(),
module_path: file_path.to_string_lossy().to_string(),
imports,
symbols,
calls,
};
// Store in cache
self.cache_manager.store_module(file_path, parsed_module.clone())?;
Ok(parsed_module)
}
fn extract_from_statement(&self, stmt: &Stmt, current_symbol: Option<&str>, imports: &mut Vec<Import>, symbols: &mut Vec<Symbol>, calls: &mut Vec<Call>, depth: usize) {
match stmt {
Stmt::Import(import_stmt) => {
for alias in &import_stmt.names {
imports.push(Import {
module_name: alias.name.to_string(),
alias: alias.asname.as_ref().map(|n| n.to_string()),
line_number: alias.range().start().into(),
});
}
}
Stmt::ImportFrom(import_from_stmt) => {
let module_name = import_from_stmt.module.as_ref()
.map(|m| m.to_string())
.unwrap_or_default();
for alias in &import_from_stmt.names {
let full_name = if module_name.is_empty() {
alias.name.to_string()
} else {
format!("{}.{}", module_name, alias.name)
};
imports.push(Import {
module_name: full_name,
alias: alias.asname.as_ref().map(|n| n.to_string()),
line_number: alias.range().start().into(),
});
}
}
Stmt::FunctionDef(func_def) => {
// Extract function definition
// Create a symbol for this function
let integrations_flags = self.detect_integrations(&func_def.body, &self._config);
let symbol = Symbol {
id: func_def.name.to_string(),
kind: crate::model::SymbolKind::Function,
module_id: "".to_string(), // Will be filled later
file_id: "".to_string(), // Will be filled later
qualname: func_def.name.to_string(),
signature: format!("def {}(...)", func_def.name),
annotations: None,
docstring_first_line: self.extract_docstring(&func_def.body), // Extract docstring
purpose: "extracted from AST".to_string(),
outbound_calls: Vec::new(),
inbound_calls: Vec::new(),
integrations_flags,
metrics: crate::model::SymbolMetrics {
fan_in: 0,
fan_out: 0,
is_critical: false,
cycle_participant: false,
},
};
symbols.push(symbol);
// Recursively process function body for calls
for body_stmt in &func_def.body {
self.extract_from_statement(body_stmt, Some(&func_def.name), imports, symbols, calls, depth + 1);
}
}
Stmt::ClassDef(class_def) => {
// Extract class definition
// Create a symbol for this class
let integrations_flags = self.detect_integrations(&class_def.body, &self._config);
let symbol = Symbol {
id: class_def.name.to_string(),
kind: crate::model::SymbolKind::Class,
module_id: "".to_string(), // Will be filled later
file_id: "".to_string(), // Will be filled later
qualname: class_def.name.to_string(),
signature: format!("class {}", class_def.name),
annotations: None,
docstring_first_line: self.extract_docstring(&class_def.body), // Extract docstring
purpose: "extracted from AST".to_string(),
outbound_calls: Vec::new(),
inbound_calls: Vec::new(),
integrations_flags,
metrics: crate::model::SymbolMetrics {
fan_in: 0,
fan_out: 0,
is_critical: false,
cycle_participant: false,
},
};
symbols.push(symbol);
// Recursively process class body
for body_stmt in &class_def.body {
self.extract_from_statement(body_stmt, Some(&class_def.name), imports, symbols, calls, depth + 1);
}
}
Stmt::Expr(expr_stmt) => {
self.extract_from_expression(&expr_stmt.value, current_symbol, calls);
}
_ => {
// For other statement types, we might still need to check for calls in expressions
// This is a simplified approach - a full implementation would need to traverse all expressions
}
}
}
fn extract_docstring(&self, body: &[Stmt]) -> Option<String> {
// For now, just return None until we figure out the correct way to extract docstrings
// TODO: Implement proper docstring extraction
None
}
fn detect_integrations(&self, body: &[Stmt], config: &Config) -> crate::model::IntegrationFlags {
let mut flags = crate::model::IntegrationFlags {
http: false,
db: false,
queue: false,
};
if !config.analysis.detect_integrations {
return flags;
}
// Convert body to string for pattern matching
let body_str = format!("{:?}", body);
// Check for HTTP integrations
for pattern in &config.analysis.integration_patterns {
if pattern.type_ == "http" {
for lib in &pattern.patterns {
if body_str.contains(lib) {
flags.http = true;
break;
}
}
} else if pattern.type_ == "db" {
for lib in &pattern.patterns {
if body_str.contains(lib) {
flags.db = true;
break;
}
}
} else if pattern.type_ == "queue" {
for lib in &pattern.patterns {
if body_str.contains(lib) {
flags.queue = true;
break;
}
}
}
}
flags
}
fn extract_function_def(&self, _func_def: &StmtFunctionDef, _symbols: &mut Vec<Symbol>, _calls: &mut Vec<Call>, _depth: usize) {
// Extract function information
// This is a simplified implementation - a full implementation would extract more details
}
fn extract_class_def(&self, _class_def: &StmtClassDef, _symbols: &mut Vec<Symbol>, _depth: usize) {
// Extract class information
// This is a simplified implementation - a full implementation would extract more details
}
fn extract_from_expression(&self, expr: &Expr, current_symbol: Option<&str>, calls: &mut Vec<Call>) {
match expr {
Expr::Call(call_expr) => {
// Extract call information
let callee_expr = self.expr_to_string(&call_expr.func);
calls.push(Call {
caller_symbol: current_symbol.unwrap_or("unknown").to_string(), // Use current symbol as caller
callee_expr,
line_number: call_expr.range().start().into(),
call_type: CallType::Unresolved,
});
// Recursively process arguments
for arg in &call_expr.args {
self.extract_from_expression(arg, current_symbol, calls);
}
for keyword in &call_expr.keywords {
self.extract_from_expression(&keyword.value, current_symbol, calls);
}
}
Expr::Attribute(attr_expr) => {
// Recursively process value
self.extract_from_expression(&attr_expr.value, current_symbol, calls);
}
_ => {
// For other expression types, recursively process child expressions
// This is a simplified approach - a full implementation would handle all expression variants
}
}
}
fn expr_to_string(&self, expr: &Expr) -> String {
match expr {
Expr::Name(name_expr) => name_expr.id.to_string(),
Expr::Attribute(attr_expr) => {
format!("{}.{}", self.expr_to_string(&attr_expr.value), attr_expr.attr)
}
_ => "<complex_expression>".to_string(),
}
}
pub fn resolve_symbols(&self, modules: &[ParsedModule]) -> Result<ProjectModel, ArchDocError> {
// Build symbol index
// Resolve cross-module references
// Build call graph
// This is a simplified implementation that creates a basic project model
// A full implementation would do much more sophisticated symbol resolution
let mut project_model = ProjectModel::new();
// Add modules to project model
for parsed_module in modules {
let module_id = parsed_module.module_path.clone();
let file_id = parsed_module.path.to_string_lossy().to_string();
// Create file doc
let file_doc = FileDoc {
id: file_id.clone(),
path: parsed_module.path.to_string_lossy().to_string(),
module_id: module_id.clone(),
imports: parsed_module.imports.iter().map(|i| i.module_name.clone()).collect(),
outbound_modules: Vec::new(), // TODO: Resolve outbound modules
inbound_files: Vec::new(),
symbols: parsed_module.symbols.iter().map(|s| s.id.clone()).collect(),
};
project_model.files.insert(file_id.clone(), file_doc);
// Add symbols to project model
for mut symbol in parsed_module.symbols.clone() {
symbol.module_id = module_id.clone();
symbol.file_id = file_id.clone();
project_model.symbols.insert(symbol.id.clone(), symbol);
}
// Create module
let module = Module {
id: module_id.clone(),
path: parsed_module.path.to_string_lossy().to_string(),
files: vec![file_id.clone()],
doc_summary: None,
outbound_modules: Vec::new(), // TODO: Resolve outbound modules
inbound_modules: Vec::new(),
symbols: parsed_module.symbols.iter().map(|s| s.id.clone()).collect(),
};
project_model.modules.insert(module_id, module);
}
// Build dependency graphs and compute metrics
self.build_dependency_graphs(&mut project_model, modules)?;
self.compute_metrics(&mut project_model)?;
Ok(project_model)
}
fn build_dependency_graphs(&self, project_model: &mut ProjectModel, parsed_modules: &[ParsedModule]) -> Result<(), ArchDocError> {
// Build module import edges
for parsed_module in parsed_modules {
let from_module_id = parsed_module.module_path.clone();
for import in &parsed_module.imports {
// Try to resolve the imported module
let to_module_id = import.module_name.clone();
// Create module import edge
let edge = crate::model::Edge {
from_id: from_module_id.clone(),
to_id: to_module_id,
edge_type: crate::model::EdgeType::ModuleImport,
meta: None,
};
project_model.edges.module_import_edges.push(edge);
}
}
// Build symbol call edges
for parsed_module in parsed_modules {
let _module_id = parsed_module.module_path.clone();
for call in &parsed_module.calls {
// Try to resolve the called symbol
let callee_expr = call.callee_expr.clone();
// Create symbol call edge
let edge = crate::model::Edge {
from_id: call.caller_symbol.clone(),
to_id: callee_expr,
edge_type: crate::model::EdgeType::SymbolCall, // TODO: Map CallType to EdgeType properly
meta: None,
};
project_model.edges.symbol_call_edges.push(edge);
}
}
Ok(())
}
fn compute_metrics(&self, project_model: &mut ProjectModel) -> Result<(), ArchDocError> {
// Compute fan-in and fan-out metrics for symbols
for symbol in project_model.symbols.values_mut() {
// Fan-out: count of outgoing calls
let fan_out = project_model.edges.symbol_call_edges
.iter()
.filter(|edge| edge.from_id == symbol.id)
.count();
// Fan-in: count of incoming calls
let fan_in = project_model.edges.symbol_call_edges
.iter()
.filter(|edge| edge.to_id == symbol.id)
.count();
symbol.metrics.fan_in = fan_in;
symbol.metrics.fan_out = fan_out;
symbol.metrics.is_critical = fan_in > 10 || fan_out > 10; // Simple threshold
symbol.metrics.cycle_participant = false; // TODO: Detect cycles
}
Ok(())
}
}

View File

@@ -0,0 +1,369 @@
//! Markdown renderer for ArchDoc
//!
//! This module handles generating Markdown documentation from the project model
//! using templates.
use crate::model::ProjectModel;
use handlebars::Handlebars;
pub struct Renderer {
templates: Handlebars<'static>,
}
impl Renderer {
pub fn new() -> Self {
let mut handlebars = Handlebars::new();
// Register templates
handlebars.register_template_string("architecture_md", Self::architecture_md_template())
.expect("Failed to register architecture_md template");
// TODO: Register other templates
Self {
templates: handlebars,
}
}
fn architecture_md_template() -> &'static str {
r#"# ARCHITECTURE — {{{project_name}}}
<!-- MANUAL:BEGIN -->
## Project summary
**Name:** {{{project_name}}}
**Description:** {{{project_description}}}
## Key decisions (manual)
{{#each key_decisions}}
- {{{this}}}
{{/each}}
## Non-goals (manual)
{{#each non_goals}}
- {{{this}}}
{{/each}}
<!-- MANUAL:END -->
---
## Document metadata
- **Created:** {{{created_date}}}
- **Updated:** {{{updated_date}}}
- **Generated by:** archdoc (cli) v0.1
---
## Integrations
<!-- ARCHDOC:BEGIN section=integrations -->
> Generated. Do not edit inside this block.
### Database Integrations
{{#each db_integrations}}
- {{{this}}}
{{/each}}
### HTTP/API Integrations
{{#each http_integrations}}
- {{{this}}}
{{/each}}
### Queue Integrations
{{#each queue_integrations}}
- {{{this}}}
{{/each}}
<!-- ARCHDOC:END section=integrations -->
---
## Rails / Tooling
<!-- ARCHDOC:BEGIN section=rails -->
> Generated. Do not edit inside this block.
{{{rails_summary}}}
<!-- ARCHDOC:END section=rails -->
---
## Repository layout (top-level)
<!-- ARCHDOC:BEGIN section=layout -->
> Generated. Do not edit inside this block.
| Path | Purpose | Link |
|------|---------|------|
{{#each layout_items}}
| {{{path}}} | {{{purpose}}} | [details]({{{link}}}) |
{{/each}}
<!-- ARCHDOC:END section=layout -->
---
## Modules index
<!-- ARCHDOC:BEGIN section=modules_index -->
> Generated. Do not edit inside this block.
| Module | Symbols | Inbound | Outbound | Link |
|--------|---------|---------|----------|------|
{{#each modules}}
| {{{name}}} | {{{symbol_count}}} | {{{inbound_count}}} | {{{outbound_count}}} | [details]({{{link}}}) |
{{/each}}
<!-- ARCHDOC:END section=modules_index -->
---
## Critical dependency points
<!-- ARCHDOC:BEGIN section=critical_points -->
> Generated. Do not edit inside this block.
### High Fan-in (Most Called)
| Symbol | Fan-in | Critical |
|--------|--------|----------|
{{#each high_fan_in}}
| {{{symbol}}} | {{{count}}} | {{{critical}}} |
{{/each}}
### High Fan-out (Calls Many)
| Symbol | Fan-out | Critical |
|--------|---------|----------|
{{#each high_fan_out}}
| {{{symbol}}} | {{{count}}} | {{{critical}}} |
{{/each}}
### Module Cycles
{{#each cycles}}
- {{{cycle_path}}}
{{/each}}
<!-- ARCHDOC:END section=critical_points -->
---
<!-- MANUAL:BEGIN -->
## Change notes (manual)
{{#each change_notes}}
- {{{this}}}
{{/each}}
<!-- MANUAL:END -->
"#
}
pub fn render_architecture_md(&self, model: &ProjectModel) -> Result<String, anyhow::Error> {
// Collect integration information
let mut db_integrations = Vec::new();
let mut http_integrations = Vec::new();
let mut queue_integrations = Vec::new();
for (symbol_id, symbol) in &model.symbols {
if symbol.integrations_flags.db {
db_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
if symbol.integrations_flags.http {
http_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
if symbol.integrations_flags.queue {
queue_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
}
// Prepare data for template
let data = serde_json::json!({
"project_name": "New Project",
"project_description": "<FILL_MANUALLY: what this project does in 37 lines>",
"created_date": "2026-01-25",
"updated_date": "2026-01-25",
"key_decisions": ["<FILL_MANUALLY>"],
"non_goals": ["<FILL_MANUALLY>"],
"change_notes": ["<FILL_MANUALLY>"],
"db_integrations": db_integrations,
"http_integrations": http_integrations,
"queue_integrations": queue_integrations,
// TODO: Fill with more actual data from model
});
self.templates.render("architecture_md", &data)
.map_err(|e| anyhow::anyhow!("Failed to render architecture.md: {}", e))
}
pub fn render_integrations_section(&self, model: &ProjectModel) -> Result<String, anyhow::Error> {
// Collect integration information
let mut db_integrations = Vec::new();
let mut http_integrations = Vec::new();
let mut queue_integrations = Vec::new();
for (symbol_id, symbol) in &model.symbols {
if symbol.integrations_flags.db {
db_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
if symbol.integrations_flags.http {
http_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
if symbol.integrations_flags.queue {
queue_integrations.push(format!("{} in {}", symbol_id, symbol.file_id));
}
}
// Prepare data for integrations section
let data = serde_json::json!({
"db_integrations": db_integrations,
"http_integrations": http_integrations,
"queue_integrations": queue_integrations,
});
// Create a smaller template just for the integrations section
let integrations_template = r#"
### Database Integrations
{{#each db_integrations}}
- {{{this}}}
{{/each}}
### HTTP/API Integrations
{{#each http_integrations}}
- {{{this}}}
{{/each}}
### Queue Integrations
{{#each queue_integrations}}
- {{{this}}}
{{/each}}
"#;
let mut handlebars = Handlebars::new();
handlebars.register_template_string("integrations", integrations_template)
.map_err(|e| anyhow::anyhow!("Failed to register integrations template: {}", e))?;
handlebars.render("integrations", &data)
.map_err(|e| anyhow::anyhow!("Failed to render integrations section: {}", e))
}
pub fn render_rails_section(&self, _model: &ProjectModel) -> Result<String, anyhow::Error> {
// For now, return a simple placeholder
Ok("\n\nNo tooling information available.\n".to_string())
}
pub fn render_layout_section(&self, model: &ProjectModel) -> Result<String, anyhow::Error> {
// Collect layout information from files
let mut layout_items = Vec::new();
for (file_id, file_doc) in &model.files {
layout_items.push(serde_json::json!({
"path": file_doc.path,
"purpose": "Source file",
"link": format!("docs/architecture/files/{}.md", file_id)
}));
}
// Prepare data for layout section
let data = serde_json::json!({
"layout_items": layout_items,
});
// Create a smaller template just for the layout section
let layout_template = r#"
| Path | Purpose | Link |
|------|---------|------|
{{#each layout_items}}
| {{{path}}} | {{{purpose}}} | [details]({{{link}}}) |
{{/each}}
"#;
let mut handlebars = Handlebars::new();
handlebars.register_template_string("layout", layout_template)
.map_err(|e| anyhow::anyhow!("Failed to register layout template: {}", e))?;
handlebars.render("layout", &data)
.map_err(|e| anyhow::anyhow!("Failed to render layout section: {}", e))
}
pub fn render_modules_index_section(&self, model: &ProjectModel) -> Result<String, anyhow::Error> {
// Collect module information
let mut modules = Vec::new();
for (module_id, module) in &model.modules {
modules.push(serde_json::json!({
"name": module_id,
"symbol_count": module.symbols.len(),
"inbound_count": module.inbound_modules.len(),
"outbound_count": module.outbound_modules.len(),
"link": format!("docs/architecture/modules/{}.md", module_id)
}));
}
// Prepare data for modules index section
let data = serde_json::json!({
"modules": modules,
});
// Create a smaller template just for the modules index section
let modules_template = r#"
| Module | Symbols | Inbound | Outbound | Link |
|--------|---------|---------|----------|------|
{{#each modules}}
| {{{name}}} | {{{symbol_count}}} | {{{inbound_count}}} | {{{outbound_count}}} | [details]({{{link}}}) |
{{/each}}
"#;
let mut handlebars = Handlebars::new();
handlebars.register_template_string("modules_index", modules_template)
.map_err(|e| anyhow::anyhow!("Failed to register modules_index template: {}", e))?;
handlebars.render("modules_index", &data)
.map_err(|e| anyhow::anyhow!("Failed to render modules index section: {}", e))
}
pub fn render_critical_points_section(&self, model: &ProjectModel) -> Result<String, anyhow::Error> {
// Collect critical points information
let mut high_fan_in = Vec::new();
let mut high_fan_out = Vec::new();
for (symbol_id, symbol) in &model.symbols {
if symbol.metrics.fan_in > 5 { // Threshold for high fan-in
high_fan_in.push(serde_json::json!({
"symbol": symbol_id,
"count": symbol.metrics.fan_in,
"critical": symbol.metrics.is_critical,
}));
}
if symbol.metrics.fan_out > 5 { // Threshold for high fan-out
high_fan_out.push(serde_json::json!({
"symbol": symbol_id,
"count": symbol.metrics.fan_out,
"critical": symbol.metrics.is_critical,
}));
}
}
// Prepare data for critical points section
let data = serde_json::json!({
"high_fan_in": high_fan_in,
"high_fan_out": high_fan_out,
"cycles": Vec::<String>::new(), // TODO: Implement cycle detection
});
// Create a smaller template just for the critical points section
let critical_points_template = r#"
### High Fan-in (Most Called)
| Symbol | Fan-in | Critical |
|--------|--------|----------|
{{#each high_fan_in}}
| {{{symbol}}} | {{{count}}} | {{{critical}}} |
{{/each}}
### High Fan-out (Calls Many)
| Symbol | Fan-out | Critical |
|--------|---------|----------|
{{#each high_fan_out}}
| {{{symbol}}} | {{{count}}} | {{{critical}}} |
{{/each}}
### Module Cycles
{{#each cycles}}
- {{{this}}}
{{/each}}
"#;
let mut handlebars = Handlebars::new();
handlebars.register_template_string("critical_points", critical_points_template)
.map_err(|e| anyhow::anyhow!("Failed to register critical_points template: {}", e))?;
handlebars.render("critical_points", &data)
.map_err(|e| anyhow::anyhow!("Failed to render critical points section: {}", e))
}
}

View File

@@ -0,0 +1,86 @@
//! File scanner for ArchDoc
//!
//! This module handles scanning the file system for Python files according to
//! the configuration settings.
use crate::config::Config;
use crate::errors::ArchDocError;
use std::path::{Path, PathBuf};
use walkdir::WalkDir;
pub struct FileScanner {
config: Config,
}
impl FileScanner {
pub fn new(config: Config) -> Self {
Self { config }
}
pub fn scan_python_files(&self, root: &Path) -> Result<Vec<PathBuf>, ArchDocError> {
// Check if root directory exists
if !root.exists() {
return Err(ArchDocError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("Root directory does not exist: {}", root.display())
)));
}
if !root.is_dir() {
return Err(ArchDocError::Io(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
format!("Root path is not a directory: {}", root.display())
)));
}
let mut python_files = Vec::new();
// Walk directory tree respecting include/exclude patterns
for entry in WalkDir::new(root)
.follow_links(self.config.scan.follow_symlinks)
.into_iter() {
let entry = entry.map_err(|e| {
ArchDocError::Io(std::io::Error::new(
std::io::ErrorKind::Other,
format!("Failed to read directory entry: {}", e)
))
})?;
let path = entry.path();
// Skip excluded paths
if self.is_excluded(path) {
if path.is_dir() {
continue;
} else {
continue;
}
}
// Include Python files
if path.extension().and_then(|s| s.to_str()) == Some("py") {
python_files.push(path.to_path_buf());
}
}
Ok(python_files)
}
fn is_excluded(&self, path: &Path) -> bool {
// Convert path to string for pattern matching
let path_str = match path.to_str() {
Some(s) => s,
None => return false, // If we can't convert to string, don't exclude
};
// Check if path matches any exclude patterns
for pattern in &self.config.scan.exclude {
if path_str.contains(pattern) {
return true;
}
}
false
}
}

237
archdoc-core/src/writer.rs Normal file
View File

@@ -0,0 +1,237 @@
//! Diff-aware file writer for ArchDoc
//!
//! This module handles writing generated documentation to files while preserving
//! manual content and only updating generated sections.
use crate::errors::ArchDocError;
use std::path::Path;
use std::fs;
use chrono::Utc;
#[derive(Debug)]
pub struct SectionMarker {
pub name: String,
pub start_pos: usize,
pub end_pos: usize,
}
#[derive(Debug)]
pub struct SymbolMarker {
pub symbol_id: String,
pub start_pos: usize,
pub end_pos: usize,
}
pub struct DiffAwareWriter {
// Configuration
}
impl DiffAwareWriter {
pub fn new() -> Self {
Self {}
}
pub fn update_file_with_markers(
&self,
file_path: &Path,
generated_content: &str,
section_name: &str,
) -> Result<(), ArchDocError> {
// Read existing file
let existing_content = if file_path.exists() {
fs::read_to_string(file_path)
.map_err(|e| ArchDocError::Io(e))?
} else {
// Create new file with template
let template_content = self.create_template_file(file_path, section_name)?;
// Write template to file
fs::write(file_path, &template_content)
.map_err(|e| ArchDocError::Io(e))?;
template_content
};
// Find section markers
let markers = self.find_section_markers(&existing_content, section_name)?;
if let Some(marker) = markers.first() {
// Replace content between markers
let new_content = self.replace_section_content(
&existing_content,
marker,
generated_content,
)?;
// Check if content has changed
let content_changed = existing_content != new_content;
// Write updated content
if content_changed {
let updated_content = self.update_timestamp(new_content)?;
fs::write(file_path, updated_content)
.map_err(|e| ArchDocError::Io(e))?;
} else {
// Content hasn't changed, but we might still need to update timestamp
// TODO: Implement timestamp update logic based on config
fs::write(file_path, new_content)
.map_err(|e| ArchDocError::Io(e))?;
}
}
Ok(())
}
pub fn update_symbol_section(
&self,
_file_path: &Path,
_symbol_id: &str,
_generated_content: &str,
) -> Result<(), ArchDocError> {
// Similar to section update but for symbol-specific markers
todo!("Implement symbol section update")
}
fn find_section_markers(&self, content: &str, section_name: &str) -> Result<Vec<SectionMarker>, ArchDocError> {
let begin_marker = format!("<!-- ARCHDOC:BEGIN section={} -->", section_name);
let end_marker = format!("<!-- ARCHDOC:END section={} -->", section_name);
let mut markers = Vec::new();
let mut pos = 0;
while let Some(begin_pos) = content[pos..].find(&begin_marker) {
let absolute_begin = pos + begin_pos;
let search_start = absolute_begin + begin_marker.len();
if let Some(end_pos) = content[search_start..].find(&end_marker) {
let absolute_end = search_start + end_pos + end_marker.len();
markers.push(SectionMarker {
name: section_name.to_string(),
start_pos: absolute_begin,
end_pos: absolute_end,
});
pos = absolute_end;
} else {
break;
}
}
Ok(markers)
}
fn replace_section_content(
&self,
content: &str,
marker: &SectionMarker,
new_content: &str,
) -> Result<String, ArchDocError> {
let before = &content[..marker.start_pos];
let after = &content[marker.end_pos..];
let begin_marker = format!("<!-- ARCHDOC:BEGIN section={} -->", marker.name);
let end_marker = format!("<!-- ARCHDOC:END section={} -->", marker.name);
Ok(format!(
"{}{}{}{}{}",
before, begin_marker, new_content, end_marker, after
))
}
fn update_timestamp(&self, content: String) -> Result<String, ArchDocError> {
// Update the "Updated" field in the document metadata section
// Find the metadata section and update the timestamp
let today = Utc::now().format("%Y-%m-%d").to_string();
// Look for the "Updated:" line and replace it
let lines: Vec<&str> = content.lines().collect();
let mut updated_lines = Vec::new();
for line in lines {
if line.trim_start().starts_with("- **Updated:**") {
updated_lines.push(format!("- **Updated:** {}", today));
} else {
updated_lines.push(line.to_string());
}
}
Ok(updated_lines.join("\n"))
}
fn create_template_file(&self, _file_path: &Path, template_type: &str) -> Result<String, ArchDocError> {
// Create file with appropriate template based on type
match template_type {
"architecture" => {
let template = r#"# ARCHITECTURE — New Project
<!-- MANUAL:BEGIN -->
## Project summary
**Name:** New Project
**Description:** <FILL_MANUALLY: what this project does in 37 lines>
## Key decisions (manual)
- <FILL_MANUALLY>
## Non-goals (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
---
## Document metadata
- **Created:** 2026-01-25
- **Updated:** 2026-01-25
- **Generated by:** archdoc (cli) v0.1
---
## Rails / Tooling
<!-- ARCHDOC:BEGIN section=rails -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=rails -->
---
## Repository layout (top-level)
<!-- ARCHDOC:BEGIN section=layout -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=layout -->
---
## Modules index
<!-- ARCHDOC:BEGIN section=modules_index -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=modules_index -->
---
## Integrations
<!-- ARCHDOC:BEGIN section=integrations -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=integrations -->
---
## Critical dependency points
<!-- ARCHDOC:BEGIN section=critical_points -->
> Generated. Do not edit inside this block.
<!-- ARCHDOC:END section=critical_points -->
---
<!-- MANUAL:BEGIN -->
## Change notes (manual)
- <FILL_MANUALLY>
<!-- MANUAL:END -->
"#;
Ok(template.to_string())
}
_ => {
Ok("".to_string())
}
}
}
}

View File

@@ -0,0 +1,100 @@
//! Caching tests for ArchDoc
//!
//! These tests verify that the caching functionality works correctly.
use std::path::Path;
use std::fs;
use tempfile::TempDir;
use archdoc_core::{Config, python_analyzer::PythonAnalyzer};
#[test]
fn test_cache_store_and_retrieve() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
def hello():
return "Hello, World!"
class Calculator:
def add(self, a, b):
return a + b
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module for the first time
let parsed_module1 = analyzer.parse_module(&temp_file)
.expect("Failed to parse module first time");
// Parse the module again - should come from cache
let parsed_module2 = analyzer.parse_module(&temp_file)
.expect("Failed to parse module second time");
// Both parses should return the same data
assert_eq!(parsed_module1.path, parsed_module2.path);
assert_eq!(parsed_module1.module_path, parsed_module2.module_path);
assert_eq!(parsed_module1.imports.len(), parsed_module2.imports.len());
assert_eq!(parsed_module1.symbols.len(), parsed_module2.symbols.len());
assert_eq!(parsed_module1.calls.len(), parsed_module2.calls.len());
}
#[test]
fn test_cache_invalidation_on_file_change() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code1 = r#"
def hello():
return "Hello, World!"
"#;
fs::write(&temp_file, python_code1).expect("Failed to write test file");
// Parse the module for the first time
let parsed_module1 = analyzer.parse_module(&temp_file)
.expect("Failed to parse module first time");
// Modify the file
let python_code2 = r#"
def hello():
return "Hello, World!"
def goodbye():
return "Goodbye, World!"
"#;
fs::write(&temp_file, python_code2).expect("Failed to write test file");
// Parse the module again - should NOT come from cache due to file change
let parsed_module2 = analyzer.parse_module(&temp_file)
.expect("Failed to parse module second time");
// The second parse should have more symbols
assert!(parsed_module2.symbols.len() >= parsed_module1.symbols.len());
}
#[test]
fn test_cache_disabled() {
let mut config = Config::default();
config.caching.enabled = false;
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
def hello():
return "Hello, World!"
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module - should work even with caching disabled
let parsed_module = analyzer.parse_module(&temp_file)
.expect("Failed to parse module with caching disabled");
assert_eq!(parsed_module.symbols.len(), 1);
}

View File

@@ -0,0 +1,131 @@
//! Enhanced analysis tests for ArchDoc
//!
//! These tests verify that the enhanced analysis functionality works correctly
//! with complex code that includes integrations, calls, and docstrings.
use std::fs;
use std::path::Path;
use archdoc_core::{Config, scanner::FileScanner, python_analyzer::PythonAnalyzer};
#[test]
fn test_enhanced_analysis_with_integrations() {
// Print current directory for debugging
let current_dir = std::env::current_dir().unwrap();
println!("Current directory: {:?}", current_dir);
// Try different paths for the config file
let possible_paths = [
"tests/golden/test_project/archdoc.toml",
"../tests/golden/test_project/archdoc.toml",
];
let config_path = possible_paths.iter().find(|&path| {
Path::new(path).exists()
}).expect("Could not find config file in any expected location");
println!("Using config path: {:?}", config_path);
let config = Config::load_from_file(Path::new(config_path)).expect("Failed to load config");
// Initialize scanner with the correct root path
let project_root = Path::new("tests/golden/test_project");
let scanner = FileScanner::new(config.clone());
// Scan for Python files
let python_files = scanner.scan_python_files(project_root)
.expect("Failed to scan Python files");
println!("Found Python files: {:?}", python_files);
// Should find both example.py and advanced_example.py
assert_eq!(python_files.len(), 2);
// Initialize Python analyzer
let analyzer = PythonAnalyzer::new(config.clone());
// Parse each Python file
let mut parsed_modules = Vec::new();
for file_path in python_files {
println!("Parsing file: {:?}", file_path);
match analyzer.parse_module(&file_path) {
Ok(module) => {
println!("Successfully parsed module: {:?}", module.module_path);
println!("Imports: {:?}", module.imports);
println!("Symbols: {:?}", module.symbols.len());
println!("Calls: {:?}", module.calls.len());
parsed_modules.push(module);
},
Err(e) => {
panic!("Failed to parse {}: {}", file_path.display(), e);
}
}
}
println!("Parsed {} modules", parsed_modules.len());
// Resolve symbols and build project model
let project_model = analyzer.resolve_symbols(&parsed_modules)
.expect("Failed to resolve symbols");
println!("Project model modules: {}", project_model.modules.len());
println!("Project model files: {}", project_model.files.len());
println!("Project model symbols: {}", project_model.symbols.len());
// Add assertions to verify the project model
assert!(!project_model.modules.is_empty());
assert!(!project_model.files.is_empty());
assert!(!project_model.symbols.is_empty());
// Check that we have the right number of modules (2 files = 2 modules)
assert_eq!(project_model.modules.len(), 2);
// Check that we have the right number of files
assert_eq!(project_model.files.len(), 2);
// Check that we have the right number of symbols
// The actual number might be less due to deduplication or other factors
// but should be at least the sum of symbols from both files minus duplicates
assert!(project_model.symbols.len() >= 10);
// Check specific details about the advanced example module
let mut found_advanced_module = false;
for (_, module) in project_model.modules.iter() {
if module.path.contains("advanced_example.py") {
found_advanced_module = true;
break;
}
}
assert!(found_advanced_module);
// Check that we found the UserService class with DB integration
let user_service_symbol = project_model.symbols.values().find(|s| s.id == "UserService");
assert!(user_service_symbol.is_some());
assert_eq!(user_service_symbol.unwrap().kind, archdoc_core::model::SymbolKind::Class);
// Check that we found the NotificationService class with queue integration
let notification_service_symbol = project_model.symbols.values().find(|s| s.id == "NotificationService");
assert!(notification_service_symbol.is_some());
assert_eq!(notification_service_symbol.unwrap().kind, archdoc_core::model::SymbolKind::Class);
// Check that we found the fetch_external_user_data function with HTTP integration
let fetch_external_user_data_symbol = project_model.symbols.values().find(|s| s.id == "fetch_external_user_data");
assert!(fetch_external_user_data_symbol.is_some());
assert_eq!(fetch_external_user_data_symbol.unwrap().kind, archdoc_core::model::SymbolKind::Function);
// Check file imports
let mut found_advanced_file = false;
for (_, file_doc) in project_model.files.iter() {
if file_doc.path.contains("advanced_example.py") {
found_advanced_file = true;
assert!(!file_doc.imports.is_empty());
// Should have imports for requests, sqlite3, redis, typing
let import_names: Vec<&String> = file_doc.imports.iter().collect();
assert!(import_names.contains(&&"requests".to_string()));
assert!(import_names.contains(&&"sqlite3".to_string()));
assert!(import_names.contains(&&"redis".to_string()));
assert!(import_names.contains(&&"typing.List".to_string()) || import_names.contains(&&"typing".to_string()));
break;
}
}
assert!(found_advanced_file);
}

View File

@@ -0,0 +1,83 @@
//! Error handling tests for ArchDoc
//!
//! These tests verify that ArchDoc properly handles various error conditions
//! and edge cases.
use std::path::Path;
use std::fs;
use tempfile::TempDir;
use archdoc_core::{Config, scanner::FileScanner, python_analyzer::PythonAnalyzer};
#[test]
fn test_scanner_nonexistent_directory() {
let config = Config::default();
let scanner = FileScanner::new(config);
// Try to scan a nonexistent directory
let result = scanner.scan_python_files(Path::new("/nonexistent/directory"));
assert!(result.is_err());
// Check that we get an IO error
match result.unwrap_err() {
archdoc_core::errors::ArchDocError::Io(_) => {},
_ => panic!("Expected IO error"),
}
}
#[test]
fn test_scanner_file_instead_of_directory() {
let config = Config::default();
let scanner = FileScanner::new(config);
// Create a temporary file
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.txt");
fs::write(&temp_file, "test content").expect("Failed to write test file");
// Try to scan a file instead of a directory
let result = scanner.scan_python_files(&temp_file);
assert!(result.is_err());
// Check that we get an IO error
match result.unwrap_err() {
archdoc_core::errors::ArchDocError::Io(_) => {},
_ => panic!("Expected IO error"),
}
}
#[test]
fn test_analyzer_nonexistent_file() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Try to parse a nonexistent file
let result = analyzer.parse_module(Path::new("/nonexistent/file.py"));
assert!(result.is_err());
// Check that we get an IO error
match result.unwrap_err() {
archdoc_core::errors::ArchDocError::Io(_) => {},
_ => panic!("Expected IO error"),
}
}
#[test]
fn test_analyzer_invalid_python_syntax() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary file with invalid Python syntax
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("invalid.py");
fs::write(&temp_file, "invalid python syntax @@#$%").expect("Failed to write test file");
// Try to parse the file
let result = analyzer.parse_module(&temp_file);
assert!(result.is_err());
// Check that we get a parse error
match result.unwrap_err() {
archdoc_core::errors::ArchDocError::ParseError { .. } => {},
_ => panic!("Expected parse error"),
}
}

View File

@@ -0,0 +1,60 @@
# Architecture Documentation
Generated at: 1970-01-01 00:00:00 UTC
## Overview
This document provides an overview of the architecture for the project.
## Modules
### example.py
File: `example.py`
#### Imports
- `os`
- `typing.List`
#### Symbols
##### Calculator
- Type: Class
- Signature: `class Calculator`
- Purpose: extracted from AST
##### Calculator.__init__
- Type: Function
- Signature: `def __init__(...)`
- Purpose: extracted from AST
##### Calculator.add
- Type: Function
- Signature: `def add(...)`
- Purpose: extracted from AST
##### Calculator.multiply
- Type: Function
- Signature: `def multiply(...)`
- Purpose: extracted from AST
##### process_numbers
- Type: Function
- Signature: `def process_numbers(...)`
- Purpose: extracted from AST
## Metrics
### Critical Components
No critical components identified.
### Component Dependencies
Dependency analysis not yet implemented.

View File

@@ -0,0 +1,107 @@
//! Golden tests for ArchDoc
//!
//! These tests generate documentation for test projects and compare the output
//! with expected "golden" files to ensure consistency.
mod test_utils;
use std::fs;
use std::path::Path;
use archdoc_core::{Config, scanner::FileScanner, python_analyzer::PythonAnalyzer};
#[test]
fn test_simple_project_generation() {
// Print current directory for debugging
let current_dir = std::env::current_dir().unwrap();
println!("Current directory: {:?}", current_dir);
// Try different paths for the config file
let possible_paths = [
"tests/golden/test_project/archdoc.toml",
"../tests/golden/test_project/archdoc.toml",
];
let config_path = possible_paths.iter().find(|&path| {
Path::new(path).exists()
}).expect("Could not find config file in any expected location");
println!("Using config path: {:?}", config_path);
let config = Config::load_from_file(Path::new(config_path)).expect("Failed to load config");
// Initialize scanner with the correct root path
let project_root = Path::new("tests/golden/test_project");
let scanner = FileScanner::new(config.clone());
// Scan for Python files
let python_files = scanner.scan_python_files(project_root)
.expect("Failed to scan Python files");
println!("Found Python files: {:?}", python_files);
// Initialize Python analyzer
let analyzer = PythonAnalyzer::new(config.clone());
// Parse each Python file
let mut parsed_modules = Vec::new();
for file_path in python_files {
println!("Parsing file: {:?}", file_path);
match analyzer.parse_module(&file_path) {
Ok(module) => {
println!("Successfully parsed module: {:?}", module.module_path);
println!("Imports: {:?}", module.imports);
println!("Symbols: {:?}", module.symbols.len());
println!("Calls: {:?}", module.calls.len());
parsed_modules.push(module);
},
Err(e) => {
panic!("Failed to parse {}: {}", file_path.display(), e);
}
}
}
println!("Parsed {} modules", parsed_modules.len());
// Resolve symbols and build project model
let project_model = analyzer.resolve_symbols(&parsed_modules)
.expect("Failed to resolve symbols");
println!("Project model modules: {}", project_model.modules.len());
println!("Project model files: {}", project_model.files.len());
println!("Project model symbols: {}", project_model.symbols.len());
// Add assertions to verify the project model
assert!(!project_model.modules.is_empty());
assert!(!project_model.files.is_empty());
assert!(!project_model.symbols.is_empty());
// Check specific details about the parsed modules
// Now we have 2 modules (example.py and advanced_example.py)
assert_eq!(project_model.modules.len(), 2);
// Find the example.py module
let mut found_example_module = false;
for (_, module) in project_model.modules.iter() {
if module.path.contains("example.py") {
found_example_module = true;
break;
}
}
assert!(found_example_module);
// Check that we found the Calculator class
let calculator_symbol = project_model.symbols.values().find(|s| s.id == "Calculator");
assert!(calculator_symbol.is_some());
assert_eq!(calculator_symbol.unwrap().kind, archdoc_core::model::SymbolKind::Class);
// Check that we found the process_numbers function
let process_numbers_symbol = project_model.symbols.values().find(|s| s.id == "process_numbers");
assert!(process_numbers_symbol.is_some());
assert_eq!(process_numbers_symbol.unwrap().kind, archdoc_core::model::SymbolKind::Function);
// Check file imports
assert!(!project_model.files.is_empty());
let file_entry = project_model.files.iter().next().unwrap();
let file_doc = file_entry.1;
assert!(!file_doc.imports.is_empty());
}

View File

@@ -0,0 +1,107 @@
"""Advanced example module for testing with integrations."""
import requests
import sqlite3
import redis
from typing import List, Dict
class UserService:
"""A service for managing users with database integration."""
def __init__(self, db_path: str = "users.db"):
"""Initialize the user service with database path."""
self.db_path = db_path
self._init_db()
def _init_db(self):
"""Initialize the database."""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
)
""")
conn.commit()
conn.close()
def create_user(self, name: str, email: str) -> Dict:
"""Create a new user in the database."""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
cursor.execute(
"INSERT INTO users (name, email) VALUES (?, ?)",
(name, email)
)
user_id = cursor.lastrowid
conn.commit()
conn.close()
return {"id": user_id, "name": name, "email": email}
def get_user(self, user_id: int) -> Dict:
"""Get a user by ID from the database."""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
row = cursor.fetchone()
conn.close()
if row:
return {"id": row[0], "name": row[1], "email": row[2]}
return None
class NotificationService:
"""A service for sending notifications with queue integration."""
def __init__(self, redis_url: str = "redis://localhost:6379"):
"""Initialize the notification service with Redis URL."""
self.redis_client = redis.Redis.from_url(redis_url)
def send_email_notification(self, user_id: int, message: str) -> bool:
"""Send an email notification by queuing it."""
notification = {
"user_id": user_id,
"message": message,
"type": "email"
}
# Push to Redis queue
self.redis_client.lpush("notifications", str(notification))
return True
def fetch_external_user_data(user_id: int) -> Dict:
"""Fetch user data from an external API."""
response = requests.get(f"https://api.example.com/users/{user_id}")
if response.status_code == 200:
return response.json()
return {}
def process_users(user_ids: List[int]) -> List[Dict]:
"""Process a list of users with various integrations."""
# Database integration
user_service = UserService()
# Queue integration
notification_service = NotificationService()
results = []
for user_id in user_ids:
# Database operation
user = user_service.get_user(user_id)
if user:
# External API integration
external_data = fetch_external_user_data(user_id)
user.update(external_data)
# Queue operation
notification_service.send_email_notification(
user_id,
f"Processing user {user['name']}"
)
results.append(user)
return results

View File

@@ -0,0 +1,29 @@
"""Example module for testing."""
import os
from typing import List
class Calculator:
"""A simple calculator class."""
def __init__(self):
"""Initialize the calculator."""
pass
def add(self, a: int, b: int) -> int:
"""Add two numbers."""
return a + b
def multiply(self, a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
def process_numbers(numbers: List[int]) -> List[int]:
"""Process a list of numbers."""
calc = Calculator()
return [calc.add(n, 1) for n in numbers]
if __name__ == "__main__":
numbers = [1, 2, 3, 4, 5]
result = process_numbers(numbers)
print(f"Processed numbers: {result}")

View File

@@ -0,0 +1,21 @@
//! Test utilities for golden tests
use std::fs;
use std::path::Path;
/// Read a file and return its contents
pub fn read_test_file(path: &str) -> String {
fs::read_to_string(path).expect(&format!("Failed to read test file: {}", path))
}
/// Write content to a file for testing
pub fn write_test_file(path: &str, content: &str) {
fs::write(path, content).expect(&format!("Failed to write test file: {}", path))
}
/// Compare two strings and panic if they don't match
pub fn assert_strings_equal(actual: &str, expected: &str, message: &str) {
if actual != expected {
panic!("{}: Strings do not match\nActual:\n{}\nExpected:\n{}", message, actual, expected);
}
}

View File

@@ -0,0 +1,134 @@
//! Integration detection tests for ArchDoc
//!
//! These tests verify that the integration detection functionality works correctly.
use std::fs;
use tempfile::TempDir;
use archdoc_core::{Config, python_analyzer::PythonAnalyzer};
#[test]
fn test_http_integration_detection() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file with HTTP integration
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
import requests
def fetch_data():
response = requests.get("https://api.example.com/data")
return response.json()
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module
let parsed_module = analyzer.parse_module(&temp_file)
.expect("Failed to parse module");
// Check that we found the function
assert_eq!(parsed_module.symbols.len(), 1);
let symbol = &parsed_module.symbols[0];
assert_eq!(symbol.id, "fetch_data");
// Check that HTTP integration is detected
assert!(symbol.integrations_flags.http);
assert!(!symbol.integrations_flags.db);
assert!(!symbol.integrations_flags.queue);
}
#[test]
fn test_db_integration_detection() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file with DB integration
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
import sqlite3
def get_user(user_id):
conn = sqlite3.connect("database.db")
cursor = conn.cursor()
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
return cursor.fetchone()
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module
let parsed_module = analyzer.parse_module(&temp_file)
.expect("Failed to parse module");
// Check that we found the function
assert_eq!(parsed_module.symbols.len(), 1);
let symbol = &parsed_module.symbols[0];
assert_eq!(symbol.id, "get_user");
// Check that DB integration is detected
assert!(!symbol.integrations_flags.http);
assert!(symbol.integrations_flags.db);
assert!(!symbol.integrations_flags.queue);
}
#[test]
fn test_queue_integration_detection() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file with queue integration
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
import redis
def process_job(job_data):
client = redis.Redis()
client.lpush("job_queue", job_data)
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module
let parsed_module = analyzer.parse_module(&temp_file)
.expect("Failed to parse module");
// Check that we found the function
assert_eq!(parsed_module.symbols.len(), 1);
let symbol = &parsed_module.symbols[0];
assert_eq!(symbol.id, "process_job");
// Check that queue integration is detected
assert!(!symbol.integrations_flags.http);
assert!(!symbol.integrations_flags.db);
assert!(symbol.integrations_flags.queue);
}
#[test]
fn test_no_integration_detection() {
let config = Config::default();
let analyzer = PythonAnalyzer::new(config);
// Create a temporary Python file with no integrations
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let temp_file = temp_dir.path().join("test.py");
let python_code = r#"
def calculate_sum(a, b):
return a + b
"#;
fs::write(&temp_file, python_code).expect("Failed to write test file");
// Parse the module
let parsed_module = analyzer.parse_module(&temp_file)
.expect("Failed to parse module");
// Check that we found the function
assert_eq!(parsed_module.symbols.len(), 1);
let symbol = &parsed_module.symbols[0];
assert_eq!(symbol.id, "calculate_sum");
// Check that no integrations are detected
assert!(!symbol.integrations_flags.http);
assert!(!symbol.integrations_flags.db);
assert!(!symbol.integrations_flags.queue);
}

View File

@@ -0,0 +1,13 @@
//! Integration tests for ArchDoc
// Include golden tests
mod golden;
mod error_handling;
mod caching;
mod integration_detection;
mod enhanced_analysis;
// Run all tests
fn main() {
// This is just a placeholder - tests are run by cargo test
}

View File

@@ -0,0 +1,93 @@
//! Tests for analyzing the test project
use archdoc_core::{
config::Config,
python_analyzer::PythonAnalyzer,
};
use std::path::Path;
#[test]
fn test_project_analysis() {
// Load config from test project
let config = Config::load_from_file(Path::new("../test-project/archdoc.toml")).unwrap();
// Initialize analyzer
let analyzer = PythonAnalyzer::new(config);
// Parse core module
let core_module = analyzer.parse_module(Path::new("../test-project/src/core.py")).unwrap();
println!("Core module symbols: {}", core_module.symbols.len());
for symbol in &core_module.symbols {
println!(" Symbol: {} ({:?}), DB: {}, HTTP: {}", symbol.id, symbol.kind, symbol.integrations_flags.db, symbol.integrations_flags.http);
}
println!("Core module calls: {}", core_module.calls.len());
for call in &core_module.calls {
println!(" Call: {} -> {}", call.caller_symbol, call.callee_expr);
}
// Check that we found symbols
assert!(!core_module.symbols.is_empty()); // Should find at least the main symbols
// Check that we found calls
assert!(!core_module.calls.is_empty());
// Check that integrations are detected
let db_integration_found = core_module.symbols.iter().any(|s| s.integrations_flags.db);
let http_integration_found = core_module.symbols.iter().any(|s| s.integrations_flags.http);
assert!(db_integration_found, "Database integration should be detected");
assert!(http_integration_found, "HTTP integration should be detected");
// Parse utils module
let utils_module = analyzer.parse_module(Path::new("../test-project/src/utils.py")).unwrap();
println!("Utils module symbols: {}", utils_module.symbols.len());
for symbol in &utils_module.symbols {
println!(" Symbol: {} ({:?}), DB: {}, HTTP: {}", symbol.id, symbol.kind, symbol.integrations_flags.db, symbol.integrations_flags.http);
}
// Check that we found symbols
assert!(!utils_module.symbols.is_empty());
}
#[test]
fn test_full_project_resolution() {
// Load config from test project
let config = Config::load_from_file(Path::new("../test-project/archdoc.toml")).unwrap();
// Initialize analyzer
let analyzer = PythonAnalyzer::new(config);
// Parse all modules
let core_module = analyzer.parse_module(Path::new("../test-project/src/core.py")).unwrap();
let utils_module = analyzer.parse_module(Path::new("../test-project/src/utils.py")).unwrap();
let modules = vec![core_module, utils_module];
// Resolve symbols
let project_model = analyzer.resolve_symbols(&modules).unwrap();
// Check project model
assert!(!project_model.modules.is_empty());
assert!(!project_model.symbols.is_empty());
assert!(!project_model.files.is_empty());
// Check that integrations are preserved in the project model
let db_integration_found = project_model.symbols.values().any(|s| s.integrations_flags.db);
let http_integration_found = project_model.symbols.values().any(|s| s.integrations_flags.http);
assert!(db_integration_found, "Database integration should be preserved in project model");
assert!(http_integration_found, "HTTP integration should be preserved in project model");
println!("Project modules: {:?}", project_model.modules.keys().collect::<Vec<_>>());
println!("Project symbols: {}", project_model.symbols.len());
// Print integration information
for (id, symbol) in &project_model.symbols {
if symbol.integrations_flags.db || symbol.integrations_flags.http {
println!("Symbol {} has DB: {}, HTTP: {}", id, symbol.integrations_flags.db, symbol.integrations_flags.http);
}
}
}

View File

@@ -0,0 +1,85 @@
//! Tests for the renderer functionality
use archdoc_core::{
model::{ProjectModel, Symbol, SymbolKind, IntegrationFlags, SymbolMetrics},
renderer::Renderer,
};
use std::collections::HashMap;
#[test]
fn test_render_with_integrations() {
// Create a mock project model with integration information
let mut project_model = ProjectModel::new();
// Add a symbol with database integration
let db_symbol = Symbol {
id: "DatabaseManager".to_string(),
kind: SymbolKind::Class,
module_id: "test_module".to_string(),
file_id: "test_file.py".to_string(),
qualname: "DatabaseManager".to_string(),
signature: "class DatabaseManager".to_string(),
annotations: None,
docstring_first_line: None,
purpose: "test".to_string(),
outbound_calls: vec![],
inbound_calls: vec![],
integrations_flags: IntegrationFlags {
db: true,
http: false,
queue: false,
},
metrics: SymbolMetrics {
fan_in: 0,
fan_out: 0,
is_critical: false,
cycle_participant: false,
},
};
// Add a symbol with HTTP integration
let http_symbol = Symbol {
id: "fetch_data".to_string(),
kind: SymbolKind::Function,
module_id: "test_module".to_string(),
file_id: "test_file.py".to_string(),
qualname: "fetch_data".to_string(),
signature: "def fetch_data()".to_string(),
annotations: None,
docstring_first_line: None,
purpose: "test".to_string(),
outbound_calls: vec![],
inbound_calls: vec![],
integrations_flags: IntegrationFlags {
db: false,
http: true,
queue: false,
},
metrics: SymbolMetrics {
fan_in: 0,
fan_out: 0,
is_critical: false,
cycle_participant: false,
},
};
project_model.symbols.insert("DatabaseManager".to_string(), db_symbol);
project_model.symbols.insert("fetch_data".to_string(), http_symbol);
// Initialize renderer
let renderer = Renderer::new();
// Render architecture documentation
let result = renderer.render_architecture_md(&project_model);
assert!(result.is_ok());
let rendered_content = result.unwrap();
println!("Rendered content:\n{}", rendered_content);
// Check that integration sections are present
assert!(rendered_content.contains("## Integrations"));
assert!(rendered_content.contains("### Database Integrations"));
assert!(rendered_content.contains("### HTTP/API Integrations"));
assert!(rendered_content.contains("DatabaseManager in test_file.py"));
assert!(rendered_content.contains("fetch_data in test_file.py"));
}

22
test-project/README.md Normal file
View File

@@ -0,0 +1,22 @@
# Test Project
A test project for ArchDoc development and testing.
## Installation
```bash
pip install -e .
```
## Usage
```bash
test-project
```
## Development
Install development dependencies:
```bash
pip install -e ".[dev]"

View File

@@ -0,0 +1,22 @@
[build-system]
requires = ["setuptools>=45", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "test-project"
version = "0.1.0"
description = "A test project for ArchDoc"
authors = [
{name = "Test Author", email = "test@example.com"}
]
dependencies = [
"requests>=2.25.0",
"sqlite3"
]
[project.optional-dependencies]
dev = [
"pytest>=6.0",
"black",
"flake8"
]

View File

@@ -0,0 +1 @@
"""Test project package."""

42
test-project/src/core.py Normal file
View File

@@ -0,0 +1,42 @@
"""Core module with database and HTTP integrations."""
import sqlite3
import requests
class DatabaseManager:
"""Manages database connections and operations."""
def __init__(self, db_path: str):
self.db_path = db_path
self.connection = None
def connect(self):
"""Connect to the database."""
self.connection = sqlite3.connect(self.db_path)
def execute_query(self, query: str):
"""Execute a database query."""
if self.connection:
cursor = self.connection.cursor()
cursor.execute(query)
return cursor.fetchall()
def fetch_external_data(url: str) -> dict:
"""Fetch data from an external API."""
response = requests.get(url)
return response.json()
def process_user_data(user_id: int) -> dict:
"""Process user data with database and external API calls."""
# Database interaction
db = DatabaseManager("users.db")
db.connect()
user_data = db.execute_query(f"SELECT * FROM users WHERE id = {user_id}")
# External API call
api_data = fetch_external_data(f"https://api.example.com/users/{user_id}")
return {
"user": user_data,
"api": api_data
}

26
test-project/src/utils.py Normal file
View File

@@ -0,0 +1,26 @@
"""Utility functions for the test project."""
import json
import os
def load_config(config_path: str) -> dict:
"""Load configuration from a JSON file."""
with open(config_path, 'r') as f:
return json.load(f)
def save_config(config: dict, config_path: str):
"""Save configuration to a JSON file."""
with open(config_path, 'w') as f:
json.dump(config, f, indent=2)
def get_file_size(filepath: str) -> int:
"""Get the size of a file in bytes."""
return os.path.getsize(filepath)
def format_bytes(size: int) -> str:
"""Format bytes into a human-readable string."""
for unit in ['B', 'KB', 'MB', 'GB']:
if size < 1024.0:
return f"{size:.1f} {unit}"
size /= 1024.0
return f"{size:.1f} TB"