Skip to content

bmaca01/cpp-interpreter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

C++ Interpreter

A simple interpreter for a custom scripting language, implemented in C++ using classic compiler design principles.

Overview

This project implements a tree-walking interpreter that processes a dynamically-typed scripting language. The interpreter follows a three-stage architecture:

  1. Lexical Analysis - Tokenizes input into meaningful symbols
  2. Parsing - Builds an Abstract Syntax Tree (AST) using recursive descent
  3. Evaluation - Executes the program by traversing the AST

Language Features

Data Types

The language supports two dynamic types:

  • Integers - Whole numbers (e.g., 42, -17, 0)
  • Strings - Text literals in double quotes (e.g., "hello", "world\n")

Variables are dynamically typed and can hold either type.

Operators

Arithmetic/String Operators:

  • + - Addition (integers) or concatenation (strings)
  • - - Subtraction (integers) or substring removal (strings)
  • * - Multiplication (integers) or string repetition (string * int)
  • / - Integer division
  • = - Assignment (right-associative)

Operator Precedence (highest to lowest):

  1. *, /
  2. +, -
  3. =

Statements

  • print <expr> - Output expression value without newline
  • println <expr> - Output expression value with newline
  • repeat <count> begin <statements> end - Loop construct
  • <variable> = <expr> - Variable assignment

All statements must end with a semicolon (;).

Comments

Single-line comments begin with //:

// This is a comment
x = 5;  // Assign 5 to x

Language Examples

Basic Arithmetic:

x = 10;
y = 20;
println x + y;  // Outputs: 30

String Operations:

name = "Alice";
greeting = "Hello, " + name;
println greeting;  // Outputs: Hello, Alice

// String repetition
stars = "*" * 5;
println stars;  // Outputs: *****

// Substring removal
text = "banana";
result = text - "na";
println result;  // Outputs: bana (removes first occurrence)

Loops:

repeat 3 begin
    println "Iteration";
end;

Chained Assignments:

a = b = c = 100;  // Right-to-left: c=100, b=100, a=100
println a;        // Outputs: 100

Complex Example:

// Fibonacci-like sequence
a = 0;
b = 1;
println a;
println b;

repeat 5 begin
    temp = a + b;
    println temp;
    a = b;
    b = temp;
end;

Architecture

Components

Lexer (lex.h, lex.cpp)

  • Finite state machine tokenizer
  • Converts source code into token stream
  • Tracks line numbers for error reporting
  • Supports escape sequences in strings (\n, etc.)

Parser (parse.h, parse.cpp)

  • Recursive descent parser with one-token lookahead
  • Builds Abstract Syntax Tree (AST)
  • Enforces grammar rules and operator precedence
  • Continues parsing after errors to report multiple issues

Parse Tree (pt.h)

  • Defines AST node types (expressions, statements, literals)
  • Implements tree-walking interpreter via Eval() methods
  • Each node type knows how to evaluate itself

Value System (val.h, val.cpp)

  • Dynamic typing with runtime type checking
  • Operator overloading for type-specific behaviors
  • Type coercion rules for mixed operations

Main Driver (main.cpp)

  • Command-line argument processing
  • Parse tree statistics reporting
  • Execution orchestration

Grammar (BNF)

Prog       → Sl
Sl         → Stmt ';' Sl | Stmt ';'
Stmt       → PrintStmt | PrintlnStmt | RepeatStmt | Expr
PrintStmt  → 'print' Expr
PrintlnStmt→ 'println' Expr
RepeatStmt → 'repeat' Expr 'begin' Stmt+ 'end'
Expr       → Sum ('=' (Sum | Expr))*
Sum        → Prod (('+' | '-') Prod)*
Prod       → Primary (('*' | '/') Primary)*
Primary    → '(' Expr ')' | IDENT | ICONST | SCONST

Building

Requirements

  • C++11 compatible compiler (g++)
  • Standard C++ library (no external dependencies)

Compilation

make

This produces an executable named program.

To clean build artifacts:

make clean

Usage

From Standard Input

./program

Type your program, then press Ctrl+D (EOF) to execute.

From File

./program myprogram.txt

Command-line Options

  • -v - Verbose mode (TODO)
  • <filename> - Read program from file

Output

After execution, the interpreter prints statistics:

PLUS COUNT: <number of + operations>
EQ COUNT: <number of = operations>
MAX DEPTH: <maximum AST depth>

Error Handling

Parse Errors:

  • Reported with line numbers
  • Parser continues to find additional errors
  • Program does not execute if parse errors exist

Runtime Errors:

  • Type mismatches (e.g., adding string to integer)
  • Division by zero
  • Undefined variable access
  • Invalid repeat counts (non-integer or negative)

Example error output:

RUNTIME ERROR: Repeat statement count must be an integer

Language Constraints

  • No conditionals - No if/else statements
  • No functions - Cannot define subroutines
  • No input - No read or input statements
  • Global scope only - All variables are global
  • Integer math only - No floating-point numbers
  • No arrays - Only scalar values
  • No comparison operators - No <, >, ==, etc.

Implementation Notes

Type Coercion

Operations check types at runtime:

  • int + int → integer addition
  • string + string → concatenation
  • int * int → multiplication
  • string * int or int * string → repetition
  • int - int → subtraction
  • string - string → remove first occurrence of substring

Mixed-type operations (e.g., int + string) produce runtime errors.

String Subtraction

String subtraction removes only the first occurrence of the substring:

text = "banana";
result = text - "na";  // "bana", not "ba"

Assignment Semantics

Assignment is right-associative and returns a value:

x = y = 5;  // Equivalent to: y = 5; x = y;

Assignment in parentheses is not allowed:

z = (x = 5);  // PARSE ERROR

Repeat Statement Restrictions

The body of a repeat statement must contain actual statements (print, assignment, or nested repeat), not bare expressions:

repeat 3 begin
    println "ok";  // Valid
end;

repeat 3 begin
    42;  // INVALID - bare expression
end;

Project Structure

cpp-interpreter/
├── lex.h / lex.cpp      # Lexical analyzer
├── parse.h / parse.cpp  # Parser
├── pt.h                 # AST nodes and interpreter
├── val.h / val.cpp      # Value type system
├── main.cpp             # Driver program
├── io.cpp               # I/O utilities
├── Makefile             # Build configuration
└── README.md

License

Educational/academic project.

About

From CS 280

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published