
yuetsh 42549af346 docs: consolidate AST checker design spec into clean structure

Reorganize the spec from 6 incremental updates into a well-structured
document with numbered sections, consistent formatting, and no
redundancy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-25 08:58:58 -06:00


AST Checker Design Spec

Tree-sitter-based code structure validation for the Online Judge platform.

1. Overview

Teachers can configure per-problem, per-language rules that validate student code structure (e.g., "must use while loop", "cannot use for loop", "must call print()"). Rules use a predefined engine library — admins never write raw tree-sitter queries.

Critical Invariant

The AST check runs AFTER normal judging, and ONLY on submissions that would otherwise be Accepted (AC). If the AST check fails, the displayed result is AST_CHECK_FAILED, but all statistics still treat the submission as AC: problem accepted_number, user profile solved status, and contest ranking. The student solved the problem correctly; they just didn't use the required syntax.

Goals

Enforce coding constraints for pedagogical purposes (beginner programming courses)
Support all 6 languages: Python3, C, C++, Java, Golang, JavaScript (Python3 and C prioritized)
Predefined rule library with parameterized engines
Full admin UI for configuring rules per problem per language
New AST_CHECK_FAILED judge status with clear error messages

Non-Goals

Output-aware checks ("do not print the complete target answer directly"): these require the expected output, not the AST
String literal content matching (.2f, 03d format specifiers) — deferred
Custom tree-sitter query support for admins

2. New Judge Status

class JudgeStatus(models.IntegerChoices):
    COMPILE_ERROR = -2
    WRONG_ANSWER = -1
    ACCEPTED = 0
    CPU_TIME_LIMIT_EXCEEDED = 1
    REAL_TIME_LIMIT_EXCEEDED = 2
    MEMORY_LIMIT_EXCEEDED = 3
    RUNTIME_ERROR = 4
    SYSTEM_ERROR = 5
    PENDING = 6
    JUDGING = 7
    PARTIALLY_ACCEPTED = 8
    AST_CHECK_FAILED = 10   # 9 is taken by frontend SubmissionStatus.submitting

Helper function in submission/models.py:

def is_accepted(result):
    return result in (JudgeStatus.ACCEPTED, JudgeStatus.AST_CHECK_FAILED)
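
As a quick illustration of how the helper behaves, the sketch below swaps the Django `IntegerChoices` class for a plain `IntEnum` stand-in (an assumption made only so the snippet runs without Django; just three of the values above are reproduced). Results read back from storage are plain integers, and the membership test still matches them.

```python
from enum import IntEnum


class JudgeStatus(IntEnum):
    # Stand-in for the Django IntegerChoices class above (subset of values).
    WRONG_ANSWER = -1
    ACCEPTED = 0
    AST_CHECK_FAILED = 10


def is_accepted(result):
    # AST_CHECK_FAILED is display-only; statistics count it as accepted.
    return result in (JudgeStatus.ACCEPTED, JudgeStatus.AST_CHECK_FAILED)


# Plain integers, as they would come back from a database column.
results = [0, 10, -1, 10]
solved = sum(1 for r in results if is_accepted(r))
print(solved)  # 3
```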

3. Data Model

Problem model — new JSONField:

ast_rules = models.JSONField(null=True, blank=True, default=None)

Schema:

{
  "Python3": [
    {"engine": "must_exist_node", "target": "for_loop", "message": "Must use a for loop"},
    {"engine": "count_node", "target": "while_loop", "min": 2, "message": "while loop must appear at least 2 times"},
    {"engine": "must_call_function", "target": "print", "message": "Must call print()"},
    {"engine": "must_use_operator", "target": "+=", "message": "Must use the += operator"},
    {"engine": "must_call_method", "target": "append", "message": "Must use append()"}
  ],
  "C": [
    {"engine": "must_exist_node", "target": "for_loop", "message": "Must use a for loop"}
  ]
}

target uses language-agnostic logical names (e.g., for_loop, while_loop). Each engine maps these to language-specific tree-sitter node types via the mapping layer. When ast_rules is null or the current language has no rules, AST checking is skipped entirely.
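
Because `ast_rules` is free-form JSON, it may help to validate it when a problem is saved. The following is a minimal sketch of such a check, not part of the spec: `validate_ast_rules` and `KNOWN_ENGINES` are hypothetical names, and the engine set is only the subset used in the schema example. It does not require `target` or `message`, since some engines take no target and the admin UI supplies a default message.

```python
# Hypothetical subset of engine names, taken from the schema example above.
KNOWN_ENGINES = {
    "must_exist_node", "count_node", "must_call_function",
    "must_use_operator", "must_call_method",
}


def validate_ast_rules(ast_rules):
    """Return a list of problems found in an ast_rules value (empty = valid)."""
    if ast_rules is None:
        return []  # null means AST checking is disabled for the problem
    if not isinstance(ast_rules, dict):
        return ["ast_rules must be an object keyed by language"]
    problems = []
    for language, rules in ast_rules.items():
        if not isinstance(rules, list):
            problems.append(f"{language}: rules must be a list")
            continue
        for i, rule in enumerate(rules):
            if not isinstance(rule, dict):
                problems.append(f"{language}[{i}]: rule must be an object")
                continue
            engine = rule.get("engine")
            if engine not in KNOWN_ENGINES:
                problems.append(f"{language}[{i}]: unknown engine {engine!r}")
            lo, hi = rule.get("min"), rule.get("max")
            if lo is not None and hi is not None and lo > hi:
                problems.append(f"{language}[{i}]: min {lo} exceeds max {hi}")
    return problems
```

An empty return value means the rules can be stored as-is; anything else could be surfaced to the admin as a form error.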

4. Submission Flow

SubmissionAPI.post()
  → create Submission(PENDING)
  → judge_task.send()
    → JudgeDispatcher.judge()
      → apply code template
      → choose judge server
      → send to judge server
      → process judge result
      → _compute_statistic_info()
      → if result == AC and ast_rules exist for this language:
          → AST check (NEW)
          → if AST fails:
              result = AST_CHECK_FAILED (display only)
              err_info = rule violation details
      → update_problem_status (treats AST_CHECK_FAILED as AC)
      → push WebSocket with final result

The check runs on self.submission.code (raw student code), not the template-wrapped version.

Integration Code

# In JudgeDispatcher.judge(), after _compute_statistic_info and result determination:

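if self.submission.result == JudgeStatus.ACCEPTED:
    ast_rules = self.problem.ast_rules
    # Skip entirely when the problem has no rules, or none for this language.
    if ast_rules and language in ast_rules:
        from ast_checker.checker import check_ast
        # Check the raw student code, not the template-wrapped version.
        passed, errors = check_ast(self.submission.code, language, ast_rules[language])
        if not passed:
            # Display-only status; statistics still count the submission as AC.
            self.submission.result = JudgeStatus.AST_CHECK_FAILED
            self.submission.statistic_info["err_info"] = "\n".join(errors)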

self.submission.save(update_fields=["result", "info", "statistic_info"])

Statistics Storage

statistic_info uses the actual result code as the key: {"0": 5, "10": 3, "-1": 20}. This means:

accepted_number = AC + AST_CHECK_FAILED combined (for overall acceptance rate)
statistic_info retains the breakdown: 5 pure AC, 3 AST check failed, 20 WA
Frontend statistics chart can show AST_CHECK_FAILED as a separate slice
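
The relationship between the two counters can be sketched as follows, using the example values above. This is only an illustration: the helper name is hypothetical, and the spec does not say that accepted_number is derived from statistic_info rather than stored separately.

```python
ACCEPTED, AST_CHECK_FAILED = 0, 10


def accepted_from_statistics(statistic_info):
    # statistic_info keys are result codes stored as strings.
    return sum(statistic_info.get(str(code), 0)
               for code in (ACCEPTED, AST_CHECK_FAILED))


info = {"0": 5, "10": 3, "-1": 20}
accepted = accepted_from_statistics(info)
total = sum(info.values())
print(accepted, total)  # 8 28
```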

Profile Status Storage

When storing status in acm_problems_status / oi_problems_status / contest_problems_status, always store JudgeStatus.ACCEPTED (0), never AST_CHECK_FAILED (10). This ensures my_status shows as AC in the problem list and sidebar without special frontend handling.
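
A minimal sketch of that normalization step, assuming the status maps are plain dicts keyed by problem id with a `status` field. The function names are hypothetical, and the real entries may carry more fields than shown here.

```python
ACCEPTED, AST_CHECK_FAILED = 0, 10


def status_for_profile(result):
    # Collapse AST_CHECK_FAILED into ACCEPTED before it reaches the profile.
    return ACCEPTED if result in (ACCEPTED, AST_CHECK_FAILED) else result


def record_status(problems_status, problem_id, result):
    entry = problems_status.get(problem_id)
    if entry is not None and entry["status"] == ACCEPTED:
        return  # already solved; later submissions never downgrade it
    problems_status[problem_id] = {"status": status_for_profile(result)}


profile = {}
record_status(profile, "42", AST_CHECK_FAILED)
record_status(profile, "42", -1)  # a later wrong answer is ignored
print(profile)  # {'42': {'status': 0}}
```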

5. Rule Engine

Directory Structure

OnlineJudge/ast_checker/
├── __init__.py
├── checker.py              # Entry point: check_ast(code, language, rules) → (bool, errors)
├── engines/
│   ├── __init__.py         # Engine registry
│   ├── base.py             # BaseEngine abstract class
│   ├── node_exists.py      # must_exist_node / must_not_exist_node
│   ├── node_count.py       # count_node
│   ├── function_call.py    # must_call_function / must_not_call_function / count_function_call
│   ├── method_call.py      # must_call_method / must_not_call_method
│   ├── operator.py         # must_use_operator
│   ├── keyword_arg.py      # must_use_keyword_arg
│   ├── import_check.py     # must_import / must_not_import
│   └── structural.py       # nested_for, chained_comparison, swap_assignment, etc.
└── mappings/
    ├── __init__.py          # get_mapping(language) dispatcher
    ├── python.py
    ├── c.py
    ├── cpp.py
    ├── java.py
    ├── go.py
    └── javascript.py

Engine Interface

from abc import ABC, abstractmethod


class BaseEngine(ABC):
    @abstractmethod
    def check(self, tree, rule, language, mapping) -> list[str]:
        """Returns error messages (empty = pass)."""
        raise NotImplementedError

Entry Point

def check_ast(code: str, language: str, rules: list[dict]) -> tuple[bool, list[str]]:
    """
    Parse code with tree-sitter, run all rules, return (passed, error_messages).
    - Empty rules → (True, [])
    - Parse failure → (True, []): skip the AST check; the submission has already
      passed judging, so a parser problem must not turn it into a failure
    """

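The rule-dispatch half of the entry point might look like the sketch below. The tree-sitter parsing step is left out on purpose, and a small `Node` dataclass stands in for a parsed tree. `run_rules`, `ENGINES`, `walk`, and `MustExistNode` are hypothetical names, the `tree` argument is treated as the root node, and skipping unknown engines is an assumption rather than something the spec states.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Stand-in for a parsed node: only .type and .children are used.
    type: str
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


class MustExistNode:
    def check(self, tree, rule, language, mapping):
        wanted = mapping[rule["target"]]
        if any(n.type == wanted for n in walk(tree)):
            return []
        return [rule.get("message") or f"missing {rule['target']}"]


ENGINES = {"must_exist_node": MustExistNode()}  # hypothetical registry


def run_rules(tree, language, rules, mapping):
    errors = []
    for rule in rules:
        engine = ENGINES.get(rule["engine"])
        if engine is None:
            continue  # assumption: unknown engines are skipped, not fatal
        errors.extend(engine.check(tree, rule, language, mapping))
    return (not errors, errors)


mapping = {"for_loop": "for_statement", "while_loop": "while_statement"}
tree = Node("module", [Node("while_statement")])
rule = {"engine": "must_exist_node", "target": "for_loop", "message": "need for"}
print(run_rules(tree, "Python3", [rule], mapping))  # (False, ['need for'])
```
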
Engine Catalog

Engine	Parameters	Description
`must_exist_node`	`target`	Node type must appear at least once
`must_not_exist_node`	`target`	Node type must not appear
`count_node`	`target`, `min?`, `max?`	Node type count within [min, max]
`must_call_function`	`target`	Must call a function (e.g., `print`, `input`)
`must_not_call_function`	`target`	Must not call a function
`count_function_call`	`target`, `min?`, `max?`	Function call count within range
`must_call_method`	`target`	Must call a method (e.g., `.append()`)
`must_not_call_method`	`target`	Must not call a method
`must_use_operator`	`target`, `category?`	Must use a specific operator (see below)
`must_use_keyword_arg`	`target` (fn), `arg_name`, `value?`	Must use keyword arg in a call
`must_import`	`target`	Must import a module
`must_not_import`	`target`	Must not import a module
`must_use_variable_name`	`target`	Must assign to a variable with this name
`must_not_use_variable_name`	`target`	Must not use a variable with this name
`nested_for`	—	Must have nested for loops
`chained_comparison`	—	Must use chained comparison (Python only)
`swap_assignment`	—	Must use swap assignment (Python only)
`chain_assignment`	—	Must use chain assignment (Python only)
`must_use_recursion`	—	Must have a self-calling function
`no_recursion`	—	No function may call itself
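
To make the optional `min?` / `max?` parameters concrete, here is a hedged sketch of the range test that `count_node` and `count_function_call` could share. Both function names are hypothetical, and the input is simplified to a flat list of node type names as a tree traversal might produce it. A missing bound is treated as "no limit".

```python
from collections import Counter


def count_in_range(count, rule):
    # A missing min or max means that side of the range is unbounded.
    lo, hi = rule.get("min"), rule.get("max")
    if lo is not None and count < lo:
        return False
    if hi is not None and count > hi:
        return False
    return True


def check_count_node(node_types, rule, mapping):
    """node_types: node type names collected from a tree traversal."""
    count = Counter(node_types)[mapping[rule["target"]]]
    if count_in_range(count, rule):
        return []
    return [rule.get("message") or f"{rule['target']} count {count} out of range"]


mapping = {"while_loop": "while_statement"}
rule = {"engine": "count_node", "target": "while_loop", "min": 2,
        "message": "need two while loops"}
print(check_count_node(["while_statement", "for_statement"], rule, mapping))
# ['need two while loops']
```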

Operator categories (auto-inferred from target):

Arithmetic (+,-,*,/,//,%,**) → binary expressions
Augmented (+=,-=) → augmented assignments
Comparison (==,!=,>,>=,<,<=) → comparisons
Logical (and,or,not) → boolean/unary expressions
Bitwise (&,|) → binary expressions
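
The auto-inference can be sketched as a simple lookup over the five groups listed above. `OPERATOR_CATEGORIES` and `infer_category` are hypothetical names, and raising on an unknown operator is an assumption about how an engine might report a misconfigured rule.

```python
OPERATOR_CATEGORIES = {
    "arithmetic": {"+", "-", "*", "/", "//", "%", "**"},
    "augmented": {"+=", "-="},
    "comparison": {"==", "!=", ">", ">=", "<", "<="},
    "logical": {"and", "or", "not"},
    "bitwise": {"&", "|"},
}


def infer_category(target, category=None):
    # An explicit `category` on the rule wins over inference.
    if category is not None:
        return category
    for name, operators in OPERATOR_CATEGORIES.items():
        if target in operators:
            return name
    raise ValueError(f"unknown operator: {target!r}")


print(infer_category("+="))  # augmented
print(infer_category("//"))  # arithmetic
```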

Language Mapping

Each mapping file exports a dict translating logical names to tree-sitter node types:

# mappings/python.py
PYTHON_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "if_statement": "if_statement",
    "else_clause": "else_clause",
    "elif_clause": "elif_clause",
    "break": "break_statement",
    "continue": "continue_statement",
    "function_definition": "function_definition",
    "return": "return_statement",
    "try_except": "try_statement",
    "with_statement": "with_statement",
    "list_comprehension": "list_comprehension",
    "list_literal": "list",
    "dict_literal": "dictionary",
    "set_literal": "set",
    "f_string": "format_string",
    "import": "import_statement",
    "import_from": "import_from_statement",
    "assignment": "assignment",
    "class_definition": "class_definition",
    # Operators map to themselves in Python
    "+": "+", "-": "-", "*": "*", "/": "/", "//": "//", "%": "%", "**": "**",
    "+=": "+=", "-=": "-=",
    "==": "==", "!=": "!=", ">": ">", ">=": ">=", "<": "<", "<=": "<=",
    "and": "and", "or": "or", "not": "not",
    "&": "&", "|": "|",
}

# mappings/c.py
C_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "if_statement": "if_statement",
    "else_clause": "else_clause",
    "break": "break_statement",
    "continue": "continue_statement",
    "function_definition": "function_definition",
    "return": "return_statement",
    "assignment": "assignment_expression",
    # ... C-specific mappings
}
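
A possible shape for the `get_mapping(language)` dispatcher in mappings/__init__.py is sketched below with trimmed copies of the two dicts. The `resolve` helper is hypothetical, and returning None for an unsupported language or an unmapped logical name is an assumption.

```python
# Trimmed copies of the mappings above, for illustration only.
PYTHON_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "elif_clause": "elif_clause",
}
C_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
}

_MAPPINGS = {"Python3": PYTHON_MAPPING, "C": C_MAPPING}


def get_mapping(language):
    # None signals that no mapping exists for this language yet.
    return _MAPPINGS.get(language)


def resolve(language, logical_name):
    """Translate a logical name to a node type, or None if not mapped."""
    mapping = get_mapping(language)
    if mapping is None:
        return None
    return mapping.get(logical_name)


print(resolve("Python3", "for_loop"))  # for_statement
print(resolve("C", "elif_clause"))     # None
```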

Known Limitations

Method call detection is name-based only: must_call_method("append") matches any .append() call regardless of the object's type, because tree-sitter provides no type information. This is acceptable for teaching scenarios.
Structural rules are language-specific: swap_assignment, chained_comparison, and chain_assignment apply only to Python. For unsupported languages the engine reports a pass (no errors).
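
The first limitation can be illustrated with a simplified stand-in tree. The `Node` dataclass and the call/attribute/identifier shape below are assumptions loosely modelled on how a Python grammar represents `nums.append(1)`; they are not the exact tree-sitter output. The detector looks only at the method name, so it cannot tell a list's `append` from any other object's.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Simplified stand-in: a real parser node exposes more than this.
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def walk(root):
    """Iterative depth-first traversal in source order."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))


def called_method_names(root):
    names = set()
    for node in walk(root):
        if node.type != "call" or not node.children:
            continue
        callee = node.children[0]
        # obj.method(...): the callee is an attribute whose last child names the method.
        if callee.type == "attribute" and callee.children:
            names.add(callee.children[-1].text)
    return names


# Roughly: nums.append(1); print(nums)
root = Node("module", children=[
    Node("call", children=[
        Node("attribute", children=[Node("identifier", "nums"),
                                    Node("identifier", "append")]),
        Node("argument_list"),
    ]),
    Node("call", children=[Node("identifier", "print"), Node("argument_list")]),
])
print(called_method_names(root))  # {'append'}
```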

6. Backend Impact Checklist

Every location that checks JudgeStatus.ACCEPTED must be updated to use is_accepted() or result__in=[ACCEPTED, AST_CHECK_FAILED].

6.1 `judge/dispatcher.py` — Statistics Methods (10 changes)

Line	Current Code	Change
106	`resp_data[i]["result"] == JudgeStatus.ACCEPTED`	NO CHANGE — individual test case results from judge server
205	`self.submission.result = JudgeStatus.ACCEPTED`	NO CHANGE — initial result assignment, before AST check
254	`self.last_result != JudgeStatus.ACCEPTED and self.submission.result == JudgeStatus.ACCEPTED`	→ `not is_accepted(self.last_result) and is_accepted(self.submission.result)`
264	`acm_problems_status[problem_id]["status"] != JudgeStatus.ACCEPTED`	→ `not is_accepted(...)`
266	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`
274	`oi_problems_status[problem_id]["status"] != JudgeStatus.ACCEPTED`	→ `not is_accepted(...)`
280	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`
292	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`
305-310	`acm_problems_status[problem_id] = {"status": self.submission.result, ...}`	→ store `JudgeStatus.ACCEPTED` as status when `is_accepted()`
308	`acm_problems_status[problem_id]["status"] != JudgeStatus.ACCEPTED`	→ `not is_accepted(...)`
310	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`
320-331	OI mode — same pattern as ACM	Same changes

6.2 `judge/dispatcher.py` — `update_contest_problem_status()` (5 changes)

Line	Current Code	Change
344	`{"status": self.submission.result, ...}`	→ store `JudgeStatus.ACCEPTED` when `is_accepted()`
345	`contest_problems_status[problem_id]["status"] != JudgeStatus.ACCEPTED`	→ `not is_accepted(...)`
346	`contest_problems_status[problem_id]["status"] = self.submission.result`	→ store `JudgeStatus.ACCEPTED` when `is_accepted()`
357,362	OI mode — same pattern	Same changes
371	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`

6.3 `judge/dispatcher.py` — `_update_acm_contest_rank()` (2 changes)

Line	Current Code	Change
409	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`
424	`self.submission.result == JudgeStatus.ACCEPTED`	→ `is_accepted(...)`

Lines 417/433 (!= COMPILE_ERROR → increment error_number) become correct automatically once 409/424 are fixed. Without this fix, AST_CHECK_FAILED would fall into the elif branch and incorrectly add a 20-minute penalty.
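
The branch structure described here can be sketched in isolation. The field names (`is_ac`, `ac_time`, `error_number`) and both helper functions are assumptions made for illustration; only the 20-minute penalty and the accepted / compile-error branching come from the text above.

```python
COMPILE_ERROR, WRONG_ANSWER, ACCEPTED, AST_CHECK_FAILED = -2, -1, 0, 10
PENALTY_SECONDS = 20 * 60


def is_accepted(result):
    return result in (ACCEPTED, AST_CHECK_FAILED)


def apply_submission(info, result, elapsed_seconds):
    """Update one problem's rank entry in place."""
    if info["is_ac"]:
        return  # already solved; later submissions do not count
    if is_accepted(result):
        info["is_ac"] = True
        info["ac_time"] = elapsed_seconds
    elif result != COMPILE_ERROR:
        info["error_number"] += 1  # each wrong attempt costs a penalty


def total_time(info):
    if not info["is_ac"]:
        return 0
    return info["ac_time"] + info["error_number"] * PENALTY_SECONDS


info = {"is_ac": False, "ac_time": 0, "error_number": 0}
apply_submission(info, WRONG_ANSWER, 300)
apply_submission(info, AST_CHECK_FAILED, 600)
print(info["error_number"], total_time(info))  # 1 1800
```

With a plain `result == ACCEPTED` comparison, the second submission would have reached the elif branch and raised error_number to 2 without ever marking the problem solved.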

6.4 `judge/dispatcher.py` — `_update_oi_contest_rank()`

NO CHANGE needed. OI rank uses statistic_info["score"] set by _compute_statistic_info() before AST check. AST check does not modify the score.

6.5 Other Backend Files (11 changes)

File	Line(s)	Change
`account/views/oj.py`	468, 483	`result=JudgeStatus.ACCEPTED` → `result__in=[ACCEPTED, AST_CHECK_FAILED]`
`comment/views/oj.py`	31	Same
`contest/views/admin.py`	220	Same
`problem/views/oj.py`	199, 210	Same
`problem/views/oj.py`	241	NO CHANGE — profile stores ACCEPTED(0)
`problem/views/admin.py`	530, 596	`Count("id", filter=Q(result=JudgeStatus.ACCEPTED))` → `Q(result__in=[...])`
`problem/views/admin.py`	444, 472	NO CHANGE — full resets
`problemset/views/oj.py`	190	`result != JudgeStatus.ACCEPTED` → `not is_accepted(result)`
`problemset/management/commands/fix_problemset_progress.py`	41	`result=JudgeStatus.ACCEPTED` → `result__in=[...]`
`class_pk/views/oj.py`	280, 291	Same
`submission/views/admin.py`	81, 94	`Count(...filter=Q(result=JudgeStatus.ACCEPTED))` → `Q(result__in=[...])`

Total: 28 backend changes + 3 no-change confirmations = 31 audit points

7. Frontend Changes

7.1 Status Code Registration

ojnext/src/utils/constants.ts:

// SubmissionStatus enum
ast_check_failed = 10,

// JUDGE_STATUS object
"10": {
  name: "Code check failed",
  type: "warning",
},

ojnext/src/utils/types.ts line 68 — add | 10 to SUBMISSION_RESULT type.

7.2 SubmissionResult.vue

Line	What	Change
37-38	Shows `err_info` for `compile_error`, `runtime_error`	Add `ast_check_failed`
110-112	Shows test case details	Add `ast_check_failed`
119	`item.result === 0` in test case filter	No change — individual test cases are result 0

7.3 SubmitCode.vue

Line	What	Change
152	Confetti on AC	NO CHANGE — no celebration for AST fail
162	`if (result !== SubmissionStatus.accepted) return` — sets `my_status = 0`	Add `ast_check_failed`: `if (result !== SubmissionStatus.accepted && result !== SubmissionStatus.ast_check_failed) return`

Without the line 162 fix, the problem stays "unsolved" in the sidebar until page refresh.

7.4 Other Frontend (no changes needed)

Component	Why
`oj/api.ts` (my_status check)	Backend stores 0 in profile, not 10
`ProblemComment.vue` (my_status check)	Same reason
`ProblemInfo.vue` (statistic chart)	Covered by constants.ts — chart auto-shows new status
`useSubmissionMonitor.ts` (WebSocket)	Only treats result 9 as "still processing"

7.5 Admin UI

In the problem edit page, add a collapsible "Code Rule Check" section:

Language tabs: Only show tabs for languages enabled on this problem
Rule list per language: Each rule is a row with:
- Engine dropdown (grouped: node checks / function calls / operators / structural checks / imports…)
- Target dropdown/input (context-dependent on engine type)
- Optional parameters: min, max, value (shown when engine uses them)
- Message input (with auto-generated default)
- Delete button
Add rule button per language tab
Collapsed by default (most problems won't have AST rules)

8. Contest Behavior

Contests and regular problems use the same AST check logic. No contest_id guard — if the contest problem has ast_rules, AST check runs.

AST_CHECK_FAILED counts as AC for contest ranking (ACM accepted_number, penalty time)
When adding a bank problem to a contest, ast_rules is copied. Contest creator can edit/clear rules on the contest copy.
update_contest_problem_status() and _update_acm_contest_rank() use is_accepted().
All contests currently use ACM mode only. OI code paths are updated for correctness, but they carry lower risk.

9. Legacy Data & Migration

Migration

One Django migration (additive, no data migration):

Add ast_rules JSONField (null=True) to Problem model
Add AST_CHECK_FAILED = 10 to JudgeStatus

Existing problems get ast_rules=null (no AST checking).

Legacy Data Policy

Existing submissions are not retroactively checked; only submissions made after rules are added go through the AST check.
accepted_number and statistic_info keep current values. statistic_info will naturally accumulate "10" entries as new AST_CHECK_FAILED submissions come in.
An optional "AST re-check" admin action for existing submissions is deferred to a later phase (see Section 11); it is not part of Phase 1.

10. Dependencies

tree-sitter
tree-sitter-python
tree-sitter-c
tree-sitter-cpp
tree-sitter-java
tree-sitter-go
tree-sitter-javascript

Distributed as prebuilt binary wheels with pre-compiled grammars for common platforms, so no system dependencies or local compiler are needed.

11. Phased Delivery

Phase 1 (MVP)

Rule engine framework + check_ast() entry point
Python3 mapping (most complete, matches full rule catalog)
C mapping (second priority)
Engines: must_exist_node, must_not_exist_node, count_node, must_call_function, must_not_call_function, count_function_call, must_call_method, must_not_call_method, must_use_operator
JudgeDispatcher integration (AST check + all 28 statistics changes)
Frontend: status code, result display, admin UI
Migration

Phase 2

Engines: must_use_keyword_arg, must_import/must_not_import, must_use_variable_name/must_not_use_variable_name
Structural engines: nested_for, chained_comparison, swap_assignment, chain_assignment, must_use_recursion, no_recursion
C++ mapping

Phase 3

Java, Go, JavaScript mappings
String literal content checks (format specifiers)
Optional "AST re-check" admin action for existing AC submissions
