C-PlagAST: AST-CC Based Plagiarism Detection Tool for C Code
C-PlagAST is an advanced plagiarism detection tool built for C programs. It leverages the AST-CC (Abstract Syntax Tree – Consistent Comparison) algorithm to identify structural similarities between code files, making it robust against superficial changes such as variable renaming, formatting variations, and reordering of functions or declarations.
Unlike traditional text-based comparison tools, C-PlagAST parses source code into abstract syntax trees, applies a series of normalization techniques (such as dead code elimination, function and declaration reordering, and prototype removal), and then generates structural hashes or similarity scores to evaluate the likelihood of plagiarism.
This tool is intended for use in academic and professional environments where accurate and structure-aware code plagiarism detection is essential.
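To make the idea of structural comparison concrete, here is a minimal Python sketch of name-insensitive and order-insensitive hashing over a toy tree. It is not the AST-CC algorithm and not C-PlagAST's code: the node shape, the node kinds (TranslationUnit, Prototype, Function, Block, Call, Return, Loop), and the helper names are all assumptions chosen for illustration.

```python
import hashlib

# Toy AST node: (kind, name, children). This shape is an assumption made for
# illustration; it is not C-PlagAST's internal representation.
def node(kind, name=None, children=()):
    return (kind, name, tuple(children))

def normalize(tree):
    """Apply simplified versions of prototype removal and dead code elimination."""
    kind, name, children = tree
    if kind == "TranslationUnit":
        # Prototype removal: drop declarations that carry no body.
        children = [c for c in children if c[0] != "Prototype"]
    if kind == "Block":
        # Dead code elimination: drop statements that follow a return.
        kept = []
        for child in children:
            kept.append(child)
            if child[0] == "Return":
                break
        children = kept
    return (kind, name, tuple(normalize(c) for c in children))

def structural_hash(tree):
    """Hash node kinds only, so identifier names do not affect the result."""
    kind, _name, children = tree
    child_hashes = [structural_hash(c) for c in children]
    if kind == "TranslationUnit":
        # Reordering: sort top-level hashes so function order is ignored.
        child_hashes.sort()
    payload = kind + "(" + ",".join(child_hashes) + ")"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

original = node("TranslationUnit", None, [
    node("Prototype", "helper"),
    node("Function", "main", [node("Block", None, [
        node("Call", "helper"), node("Return")])]),
    node("Function", "helper", [node("Block", None, [
        node("Return"), node("Call", "never_reached")])]),
])

# Same structure with renamed and reordered functions, no prototype, no dead code.
suspect = node("TranslationUnit", None, [
    node("Function", "aux", [node("Block", None, [node("Return")])]),
    node("Function", "start", [node("Block", None, [
        node("Call", "aux"), node("Return")])]),
])

# A structurally different program: the call sits inside a loop.
different = node("TranslationUnit", None, [
    node("Function", "main", [node("Block", None, [
        node("Loop", None, [node("Call", "helper")]), node("Return")])]),
])

same = structural_hash(normalize(original)) == structural_hash(normalize(suspect))
```

In this sketch `same` is true because both trees normalize to the same shape, while `different` hashes to another value. Dropping identifier names entirely is a deliberately crude choice; a real detector would likely need something finer to avoid false matches.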
- AST-based structural comparison of C programs
- Supports normalization techniques:
  - Dead code and unreachable code removal
  - Declaration and function reordering
  - Prototype elimination
- Confusion matrix and accuracy reporting
- CLI-based execution with support for batch testing
- Ideal for academic plagiarism detection in C programming assignments
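As a rough illustration of what confusion matrix and accuracy reporting involves, the following Python sketch tallies outcomes from predicted and actual labels. The function names and the label lists are hypothetical: the values are made up for the example and are not output from C-PlagAST, whose report format is not described here.

```python
def confusion_matrix(predicted, actual):
    """Count outcomes for parallel lists of booleans (True = plagiarized)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["tp"] += 1      # flagged, and really plagiarized
        elif p and not a:
            counts["fp"] += 1      # flagged, but original work
        elif not p and a:
            counts["fn"] += 1      # missed plagiarism
        else:
            counts["tn"] += 1      # correctly left alone
    return counts

def accuracy(counts):
    """Fraction of pairs classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0

# Made-up labels for five test pairs, purely for illustration.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]

counts = confusion_matrix(predicted, actual)
print(counts, accuracy(counts))
```

With these made-up labels, three of the five pairs are classified correctly, so the accuracy is 0.6.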
This project supports multiple build options for Linux, WSL, and Windows.
On Linux or WSL, make sure the required packages are installed before building:

    sudo apt update && sudo apt install bison g++ make

This installs:
- bison – for generating the parser
- g++ – for compiling C++ code
- make – for using the Makefile
To clean previous builds:

    make clean

To build with the Makefile:

    make

Or build with the script:

    bash build.sh

To check for plagiarism:

    ./bin/detector_with_filtering.exe original.c --ast-cc-test suspected1.c suspected2.c suspected3.c

Each file listed after the --ast-cc-test flag is compared against original.c.
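For checking a whole folder of submissions, a small wrapper can assemble the command shown above. This is a sketch under assumptions: the helper names and the idea of a submissions directory are hypothetical, only the documented --ast-cc-test flag is used, and the tool's output is passed through as text without being parsed.

```python
import subprocess
from pathlib import Path

# Path taken from the Linux/WSL command above; adjust it for your checkout.
DETECTOR = "./bin/detector_with_filtering.exe"

def build_command(original, suspects):
    """Build the argument list for one --ast-cc-test run."""
    return [DETECTOR, str(original), "--ast-cc-test", *map(str, suspects)]

def run_batch(original, submissions_dir):
    """Compare every .c file in submissions_dir against original.

    Returns whatever the detector writes to standard output, unparsed,
    or an empty string when the directory holds no .c files.
    """
    suspects = sorted(Path(submissions_dir).glob("*.c"))
    if not suspects:
        return ""
    result = subprocess.run(
        build_command(original, suspects),
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit code
    )
    return result.stdout

# Example call (hypothetical directory name):
# print(run_batch("original.c", "submissions"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.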
To print the normalized AST for debugging:

    ./bin/detector_with_filtering.exe --printAST test1.c test2.c test3.c

On Windows, run the build script from File Explorer or Command Prompt:

    .\build.bat

⚠️ Ensure bison and g++ are available in your PATH. Use MSYS2 or WSL if necessary.
To check for plagiarism:

    bin\detector_with_filtering.exe original.c --ast-cc-test suspected1.c suspected2.c suspected3.c

Each file listed after the --ast-cc-test flag is compared against original.c.
To print the normalized AST:

    bin\detector_with_filtering.exe --printAST test1.c test2.c test3.c

- Executables are generated in the bin/ directory:
  - detector_with_filtering.exe – Main detector with normalization and filtering
  - c_parser.exe – Optional standalone parser binary
- Generated parser files are written to the build/ directory upon compilation:
  - parser.cpp
  - parser.hpp
  - parser.output