
yuetsh 953b4dc851 docs: add AST checker design spec

Tree-sitter based code structure validation for submissions,
with per-problem per-language rule configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-25 08:31:25 -06:00


AST Checker Design Spec

Overview

Add tree-sitter-based AST checking to the Online Judge submission flow. Teachers can configure per-problem, per-language rules that validate student code structure before judging (e.g., "must use while loop", "cannot use for loop", "must call print()").

Goals

Enforce coding constraints for pedagogical purposes (beginner programming courses)
Support all 6 languages: Python3, C, C++, Java, Golang, JavaScript
Predefined rule library with parameterized engines (no raw tree-sitter queries for admins)
Full admin UI for configuring rules per problem per language
New AST_CHECK_FAILED judge status with clear error messages

Non-Goals

Output-aware checks ("must not print the complete target answer directly"): these require the expected output, not the AST
String literal content matching (.2f, 03d format specifiers) — deferred to a later phase
Custom tree-sitter query support for admins

Architecture

Submission Flow (modified)

SubmissionAPI.post()
  → create Submission(PENDING)
  → judge_task.send()
    → JudgeDispatcher.judge()
      → apply code template
      → **AST check** ← NEW
        → fail: result=AST_CHECK_FAILED, write err_info, push WebSocket, return
        → pass: continue
      → choose judge server
      → send to judge server
      → process result

The AST check runs inside the Dramatiq task, after template application and before judge server dispatch. This mirrors how COMPILE_ERROR is handled: the submission is still recorded in history, and the result is pushed via WebSocket.

Data Model

Problem model — new JSONField:

ast_rules = models.JSONField(null=True, blank=True, default=None)

Schema:

{
  "Python3": [
    {"engine": "must_exist_node", "target": "for_loop", "message": "Must use a for loop"},
    {"engine": "count_node", "target": "while_loop", "min": 2, "message": "Must use at least 2 while loops"},
    {"engine": "must_call_function", "target": "print", "message": "Must call print()"},
    {"engine": "must_use_operator", "target": "+=", "message": "Must use the += operator"},
    {"engine": "must_call_method", "target": "append", "message": "Must call append()"}
  ],
  "C": [
    {"engine": "must_exist_node", "target": "for_loop", "message": "Must use a for loop"}
  ]
}

Key design: target uses language-agnostic logical names (e.g., for_loop, while_loop, print). Each engine maps these to language-specific tree-sitter node types internally.

When ast_rules is null or the current language has no rules, AST checking is skipped entirely.
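The skip condition and the rule shape above could be sketched as follows. This is a minimal illustration, not the spec's implementation: `rules_for_language` and `validate_rule` are hypothetical helper names, and the specific validation checks (non-empty `engine`, non-negative integer bounds, `min <= max`) are assumptions about what a well-formed rule looks like.

```python
from typing import Any, Optional


def rules_for_language(ast_rules: Optional[dict[str, Any]], language: str) -> list[dict]:
    """Return the rules configured for `language`, or [] when checking is skipped."""
    if not ast_rules:  # null / empty config: no AST checking at all
        return []
    return list(ast_rules.get(language) or [])


def validate_rule(rule: dict) -> list[str]:
    """Return schema problems for one rule dict (empty list = well-formed).

    Hypothetical helper: the checks here are assumptions, not part of the spec.
    """
    problems = []
    engine = rule.get("engine")
    if not isinstance(engine, str) or not engine:
        problems.append("'engine' must be a non-empty string")
    for bound in ("min", "max"):
        value = rule.get(bound)
        if value is None:
            continue
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"'{bound}' must be a non-negative integer")
    low, high = rule.get("min"), rule.get("max")
    if isinstance(low, int) and isinstance(high, int) and low > high:
        problems.append("'min' must not exceed 'max'")
    return problems
```

A validator like this would most naturally run when the admin saves the problem, so that malformed rules never reach the judge task.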

JudgeStatus — new status code:

class JudgeStatus(models.IntegerChoices):
    COMPILE_ERROR = -2, "Compile Error"
    WRONG_ANSWER = -1, "Wrong Answer"
    ACCEPTED = 0, "Accepted"
    CPU_TIME_LIMIT_EXCEEDED = 1, "CPU Time Limit Exceeded"
    REAL_TIME_LIMIT_EXCEEDED = 2, "Real Time Limit Exceeded"
    MEMORY_LIMIT_EXCEEDED = 3, "Memory Limit Exceeded"
    RUNTIME_ERROR = 4, "Runtime Error"
    SYSTEM_ERROR = 5, "System Error"
    PENDING = 6, "Pending"
    JUDGING = 7, "Judging"
    PARTIALLY_ACCEPTED = 8, "Partially Accepted"
    AST_CHECK_FAILED = 9, "AST Check Failed"   # NEW

Frontend constants.ts must be updated with the new status code, label, and color.

Rule Engine Architecture

Directory Structure

OnlineJudge/ast_checker/
├── __init__.py
├── checker.py              # Entry point: check_ast(code, language, rules) → (passed, errors)
├── engines/
│   ├── __init__.py         # Engine registry
│   ├── base.py             # BaseEngine abstract class
│   ├── node_exists.py      # must_exist_node / must_not_exist_node
│   ├── node_count.py       # count_node
│   ├── function_call.py    # must_call_function / must_not_call_function / count_function_call
│   ├── method_call.py      # must_call_method / must_not_call_method
│   ├── operator.py         # must_use_operator
│   ├── keyword_arg.py      # must_use_keyword_arg
│   ├── import_check.py     # must_import / must_not_import
│   └── structural.py       # nested_for, chained_comparison, swap_assignment, etc.
└── mappings/
    ├── __init__.py          # get_mapping(language) dispatcher
    ├── python.py
    ├── c.py
    ├── cpp.py
    ├── java.py
    ├── go.py
    └── javascript.py

Engine Interface

class BaseEngine:
    def check(self, tree, rule, language, mapping) -> list[str]:
        """
        Returns a list of error messages (empty = pass).
        - tree: tree-sitter parsed tree
        - rule: the rule dict (engine, target, message, min, max, value, etc.)
        - language: language name string
        - mapping: language-specific node type mapping dict
        """
        raise NotImplementedError
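To make the interface concrete, here is a rough sketch of how a `count_node` engine might implement `check`. It assumes only that nodes expose `type` and `children` and that the tree exposes `root_node`, which matches the py-tree-sitter object model. `FakeNode`, `walk`, and `CountNodeEngine` are illustrative names. Treating an unmapped target as a pass is an assumption, and in the real package the class would presumably subclass `BaseEngine`.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Iterator


@dataclass
class FakeNode:
    """Stand-in for a tree-sitter node; only `type` and `children` are used."""
    type: str
    children: list["FakeNode"] = field(default_factory=list)


def walk(node) -> Iterator:
    """Yield `node` and all of its descendants, depth-first, without recursion."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        stack.extend(reversed(current.children))


class CountNodeEngine:
    """Hypothetical count_node engine: the node count must fall within [min, max]."""

    def check(self, tree, rule, language, mapping) -> list[str]:
        node_type = mapping.get(rule["target"])
        if node_type is None:
            return []  # assumption: an unmapped target passes rather than errors
        count = sum(1 for n in walk(tree.root_node) if n.type == node_type)
        low = rule.get("min") or 0   # missing or null min means no lower bound
        high = rule.get("max")       # missing or null max means no upper bound
        if count < low or (high is not None and count > high):
            default = f"{rule['target']} count {count} is outside the allowed range"
            return [rule.get("message", default)]
        return []


# Usage with a hand-built tree: two nested while loops and one for loop.
tree = SimpleNamespace(root_node=FakeNode("module", [
    FakeNode("while_statement", [FakeNode("while_statement")]),
    FakeNode("for_statement"),
]))
mapping = {"while_loop": "while_statement", "for_loop": "for_statement"}
engine = CountNodeEngine()
```

With this shape, `must_exist_node` is the special case `min=1` and `must_not_exist_node` is `max=0`, so the three node engines could share one traversal.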

Engine Catalog

Engine Name	Parameters	Description
`must_exist_node`	`target`	Node type must appear at least once
`must_not_exist_node`	`target`	Node type must not appear
`count_node`	`target`, `min?`, `max?`	Node type count must be within [min, max]
`must_call_function`	`target`	Must call a specific function (e.g., `print`, `input`)
`must_not_call_function`	`target`	Must not call a specific function
`count_function_call`	`target`, `min?`, `max?`	Function call count within range
`must_call_method`	`target`	Must call a method (e.g., `.append()`, `.split()`)
`must_not_call_method`	`target`	Must not call a method
`must_use_operator`	`target`, `category?`	Must use a specific operator. Category auto-inferred from target: arithmetic (`+`, `-`, `*`, `/`, `//`, `%`, `**`) → search in binary expressions; augmented (`+=`, `-=`) → search in augmented assignments; comparison (`==`, `!=`, `>`, `>=`, `<`, `<=`) → search in comparisons; logical (`and`, `or`, `not`) → search in boolean/unary expressions; bitwise (`&`, `|`) → search in binary expressions
`must_use_keyword_arg`	`target` (function), `arg_name`, `value?`	Must use keyword arg in a call
`must_import`	`target`	Must import a specific module
`must_not_import`	`target`	Must not import a specific module
`must_use_variable_name`	`target`	Must assign to a variable with this name
`must_not_use_variable_name`	`target`	Must not assign to a variable with this name
`nested_for`	—	Must have a for loop nested inside another for loop
`chained_comparison`	—	Must use chained comparison (e.g., `a < b < c`)
`swap_assignment`	—	Must use swap assignment (e.g., `a, b = b, a`)
`chain_assignment`	—	Must use chain assignment (e.g., `a = b = 1`)
`must_use_recursion`	—	Must have a function that calls itself
`no_recursion`	—	No function may call itself
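The category auto-inference described for `must_use_operator` could look roughly like this. The operator sets are taken from the catalog row and the Python mapping. `OPERATOR_CATEGORIES` and `infer_operator_category` are hypothetical names, and returning `None` for an unknown operator is an assumption.

```python
from typing import Optional

# Operator sets per category, as listed in the engine catalog.
OPERATOR_CATEGORIES = {
    "arithmetic": {"+", "-", "*", "/", "//", "%", "**"},
    "augmented": {"+=", "-="},
    "comparison": {"==", "!=", ">", ">=", "<", "<="},
    "logical": {"and", "or", "not"},
    "bitwise": {"&", "|"},
}


def infer_operator_category(target: str, category: Optional[str] = None) -> Optional[str]:
    """Return the explicit category if given, else infer it from the operator."""
    if category is not None:
        return category
    for name, operators in OPERATOR_CATEGORIES.items():
        if target in operators:
            return name
    return None  # assumption: unknown operators have no category
```

The engine can then use the category to decide which kind of expression node to search, instead of scanning every node for a matching token.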

Language Mapping

Each mapping file exports a dict translating logical names to tree-sitter node types:

# mappings/python.py
PYTHON_MAPPING = {
    # Node types
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "if_statement": "if_statement",
    "else_clause": "else_clause",
    "elif_clause": "elif_clause",
    "break": "break_statement",
    "continue": "continue_statement",
    "function_definition": "function_definition",
    "return": "return_statement",
    "try_except": "try_statement",
    "with_statement": "with_statement",
    "list_comprehension": "list_comprehension",
    "list_literal": "list",
    "dict_literal": "dictionary",
    "set_literal": "set",
    "f_string": "format_string",
    "import": "import_statement",
    "import_from": "import_from_statement",
    "assignment": "assignment",
    "class_definition": "class_definition",

    # Operators
    "+": "+",
    "-": "-",
    "*": "*",
    "/": "/",
    "//": "//",
    "%": "%",
    "**": "**",
    "+=": "+=",
    "-=": "-=",
    "==": "==",
    "!=": "!=",
    ">": ">",
    ">=": ">=",
    "<": "<",
    "<=": "<=",
    "and": "and",
    "or": "or",
    "not": "not",
    "&": "&",
    "|": "|",
}

# mappings/c.py
C_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "if_statement": "if_statement",
    "else_clause": "else_clause",
    "break": "break_statement",
    "continue": "continue_statement",
    "function_definition": "function_definition",
    "return": "return_statement",
    "assignment": "assignment_expression",
    # ... C-specific mappings
}
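A minimal sketch of the `get_mapping(language)` dispatcher in `mappings/__init__.py`, assuming the language keys match those used in `ast_rules` ("Python3", "C"). The mappings are abbreviated here to stay self-contained. `resolve` is a hypothetical convenience helper, and returning an empty mapping for an unknown language is an assumption.

```python
from typing import Optional

# Abbreviated copies of the mappings shown above.
PYTHON_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "list_comprehension": "list_comprehension",
}
C_MAPPING = {
    "for_loop": "for_statement",
    "while_loop": "while_statement",
    "assignment": "assignment_expression",
}

_MAPPINGS = {"Python3": PYTHON_MAPPING, "C": C_MAPPING}


def get_mapping(language: str) -> dict[str, str]:
    """Return the logical-name mapping for `language` ({} if none is registered)."""
    return _MAPPINGS.get(language, {})


def resolve(language: str, logical_name: str) -> Optional[str]:
    """Translate a logical name to a tree-sitter node type, or None if unmapped."""
    return get_mapping(language).get(logical_name)
```

Returning `None` for an unmapped name lets each engine decide whether a missing mapping means "pass" or "configuration error".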

Entry Point

# checker.py
def check_ast(code: str, language: str, rules: list[dict]) -> tuple[bool, list[str]]:
    """
    Parse code with tree-sitter, run all rules, return (passed, error_messages).
    If rules is empty, returns (True, []).
    If the parsed tree contains syntax errors (tree-sitter still returns a tree,
    with ERROR nodes), returns (True, []): skip the AST check and let the
    compiler report the error downstream.
    """

Known Limitations

Method call detection is name-based only: must_call_method("append") matches any .append() call regardless of object type. tree-sitter provides no type information. Acceptable for teaching scenarios.
Structural rules are language-specific: swap_assignment, chained_comparison, and chain_assignment apply only to Python. For unsupported languages, the engine should return an empty error list (pass) rather than raising an error.
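The "pass on unsupported languages" behaviour could be handled by a small guard in front of the structural engines, sketched here under the assumption that the language key for Python is "Python3" as in `ast_rules`. `PYTHON_ONLY_ENGINES`, `applies_to`, and `guarded_check` are hypothetical names.

```python
# Structural engines that only make sense for Python, per the limitation above.
PYTHON_ONLY_ENGINES = frozenset({"swap_assignment", "chained_comparison", "chain_assignment"})


def applies_to(engine_name: str, language: str) -> bool:
    """Return False when a Python-only engine is used with another language."""
    if engine_name in PYTHON_ONLY_ENGINES:
        return language == "Python3"
    return True


def guarded_check(engine_name, check, tree, rule, language, mapping) -> list[str]:
    """Run `check` only if the engine applies; otherwise report a pass ([])."""
    if not applies_to(engine_name, language):
        return []
    return check(tree, rule, language, mapping)
```

The admin UI could reuse the same table to hide these engines on non-Python tabs, so that a silently passing rule is never configured in the first place.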

Integration in JudgeDispatcher

# In JudgeDispatcher.judge(), after template application:
def judge(self):
    language = self.submission.language
    sub_config = list(filter(...))[0]

    if language in self.problem.template:
        template = parse_problem_template(self.problem.template[language])
        code = f"{template['prepend']}\n{self.submission.code}\n{template['append']}"
    else:
        code = self.submission.code

    # --- AST CHECK (NEW) ---
    ast_rules = self.problem.ast_rules
    if ast_rules and language in ast_rules:
        from ast_checker.checker import check_ast
        passed, errors = check_ast(self.submission.code, language, ast_rules[language])
        if not passed:
            self.submission.result = JudgeStatus.AST_CHECK_FAILED
            self.submission.statistic_info["err_info"] = "\n".join(errors)
            self.submission.statistic_info["score"] = 0
            self.submission.save(update_fields=["result", "info", "statistic_info"])
            try:
                push_submission_update(
                    submission_id=str(self.submission.id),
                    user_id=self.submission.user_id,
                    data={
                        "type": "submission_update",
                        "submission_id": str(self.submission.id),
                        "result": JudgeStatus.AST_CHECK_FAILED,
                        "status": "finished",
                    }
                )
            except Exception as e:
                logger.error(f"Failed to push submission update: {str(e)}")
            return
    # --- END AST CHECK ---

    # ... continue with judge server dispatch

Note: AST check runs on self.submission.code (raw student code), not the template-wrapped code, because the template prepend/append is not student-written.

Frontend Changes

Admin UI (ojnext)

On the problem edit page, add a collapsible "Code Rule Check" (代码规则检查) section:

Language tabs: Only show tabs for languages selected in this problem's languages field
Rule list per language: Each rule is a row with:
- Engine dropdown (grouped by category: Node checks / Function calls / Operators / Structural checks / Imports…)
- Target dropdown/input (context-dependent: node types for node engines, function names for call engines, operators for operator engine)
- Optional parameters: min, max, value fields (shown only when the selected engine uses them)
- Message input (custom error message, with auto-generated default)
- Delete button
Add rule button per language tab
Section is collapsed by default (most problems won't have AST rules)

Status Display

Add AST_CHECK_FAILED = 9 to ojnext/src/utils/constants.ts
Assign a distinct color (suggest orange, between COMPILE_ERROR red and PENDING grey)
Label: "AST Check Failed" / "代码结构检查未通过"
When viewing a submission with this status, display statistic_info.err_info as the error detail (same rendering as COMPILE_ERROR)

Dependencies

Backend (add to pyproject.toml / requirements):

tree-sitter (Python bindings)
tree-sitter-python
tree-sitter-c
tree-sitter-cpp
tree-sitter-java
tree-sitter-go
tree-sitter-javascript

These packages ship as prebuilt binary wheels that bundle the compiled grammars, so on platforms with a published wheel no system dependencies or C toolchain are needed.

Migration

One Django migration:

Add ast_rules JSONField (null=True) to Problem model
Add AST_CHECK_FAILED = 9 to JudgeStatus

Both are additive, no data migration needed. Existing problems get ast_rules=null (no AST checking).

Phased Delivery

Note: Most problems use Python3 and C. Prioritize these two languages.

Phase 1 (MVP)

Rule engine framework + checker entry point
Python3 mapping (most complete, matches the full rule catalog)
C mapping (second priority, covers the most-used language pair)
Engines: must_exist_node, must_not_exist_node, count_node, must_call_function, must_not_call_function, count_function_call, must_call_method, must_not_call_method, must_use_operator
JudgeDispatcher integration
Frontend: status code + admin UI
Migration

Phase 2

Remaining engines: must_use_keyword_arg, must_import/must_not_import, must_use_variable_name/must_not_use_variable_name
Structural engines: nested_for, chained_comparison (Python only), swap_assignment (Python only), chain_assignment (Python only), must_use_recursion, no_recursion
C++ mapping (shares most structure with C)

Phase 3

Java, Go, JavaScript mappings
String literal content checks (format specifiers)
Additional structural rules as needed
