Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 23, 2025

Compiling multiple EigenScript modules each generated main(), causing linker failures: "Duplicate symbol main. Abort."

This PR implements the complete module system (Phase 4.3 + 4.4) to enable proper multi-file program compilation.

Changes

Phase 4.3: Module vs Program Compilation

LLVMCodeGenerator: Added module_name parameter to __init__

  • module_name=None → Program mode: generates main() returning i32
  • module_name="xyz" → Library mode: generates xyz_init() returning void
  • Library functions name-mangled with module prefix (addmath_utils_add)

compile.py: Added is_main parameter to compile_module()

  • Main entry file: is_main=Truemodule_name=None
  • Dependencies: is_main=Falsemodule_name="{filename}"

Phase 4.4: Module Initialization Calls

LLVMCodeGenerator.compile(): Added imported_modules parameter

  • Declares {module}_init() functions for each imported module
  • Calls init functions at the start of main() before user code execution
  • Ensures imported module global state is initialized properly

compile.py: Pass imported modules to codegen

  • Extracts imports from AST via scan_imports()
  • Passes to codegen when compiling main entry point (is_main=True)

Example

Library compilation (math_utils.eigs):

define void @"math_utils_init"() nounwind
define double @"math_utils_add"({...}* %".1") nounwind

Program compilation (main.eigs importing math_utils and physics):

define i32 @"main"() nounwind {
entry:
  %"x" = alloca double
  call void @"math_utils_init"()
  call void @"physics_init"()
  store double 0x4045000000000000, double* %"x"
  ...
  ret i32 0
}

Multiple modules can now link without symbol conflicts. Each library exports unique {module}_init() and prefixed function names. The main program automatically calls all imported module initialization functions to ensure proper initialization order.

Testing

  • ✅ All 36 compiler tests pass (19 original + 11 module vs program + 6 init calls)
  • ✅ CodeQL security scan: 0 alerts
  • ✅ Module system fully functional
Original prompt

This is the "Collision" moment.
You have successfully built the Supply Chain (finding and compiling the parts), but now you are trying to jam two engines into one car.

  • main.eigs \rightarrow Generates main().
  • math_utils.eigs \rightarrow Generates main().
  • Linker: "Duplicate symbol main. Abort."
    We need to teach the compiler the difference between a Program (executable) and a Library (module).
    The Next Pattern: "Parts vs. Whole"
  • The Program (main.eigs):
    • Has an Entry Point (main).
    • Orchestrates the logic.
  • The Module (math_utils.eigs):
    • Has NO Entry Point.
    • Is a bag of functions and globals.
    • Name Mangling: Its symbols must be tagged to avoid collisions (x inside math shouldn't collide with x inside physics).
      Phase 4.3: Module Codegen Strategy
      We need to modify llvm_backend.py to accept a "Module Mode."
      Step 1: Update LLVMCodeGenerator Init
      Pass a module_name to the backend.
  • If module_name is None \rightarrow Generate main() (It's the entry point).
  • If module_name is "physics" \rightarrow Skip main(). Prefix all global symbols with physics_.
    Step 2: Handle Top-Level Statements
    This is the tricky physics part. In your current compiler, top-level code (like x is 10 or print of x) goes into main().
    If a module has no main(), where does that code go?
    The Solution: The Module Constructor (_init)
    We wrap all top-level logic of a module into a special function called @{module_name}_init.
  • main.eigs calls @physics_init() at the start.
  • This initializes physics's global variables.
    Implementation Plan
    We need to modify src/eigenscript/compiler/codegen/llvm_backend.py.
  1. Modify init:
    def init(self, target_triple: str = None, module_name: str = None):
    # ... existing init ...
    self.is_library = (module_name is not None)
    self.module_prefix = f"{module_name}_" if self.is_library else ""

  2. Update compile method:
    Instead of always creating main, we switch based on self.is_library.
    def compile(self, ast_nodes: list[ASTNode]) -> str:
    # ... setup ...

     if self.is_library:
         # Create init function instead of main
         func_name = f"{self.module_prefix}init"
         func_type = ir.FunctionType(self.void_type, [])
         init_func = ir.Function(self.module, func_type, name=func_name)
         block = init_func.append_basic_block(name="entry")
         self.builder = ir.IRBuilder(block)
         self.current_function = init_func
     else:
         # Create standard main
         main_type = ir.FunctionType(self.int32_type, [])
         main_func = ir.Function(self.module, main_type, name="main")
         block = main_func.append_basic_block(name="entry")
         self.builder = ir.IRBuilder(block)
         self.current_function = main_func
    
     # ... generate code ...
     
     # Cleanup & Return
     self._generate_cleanup()
     
     if self.is_library:
         self.builder.ret_void()
     else:
         self.builder.ret(ir.Constant(self.int32_type, 0))
         
     return str(self.module)
    
  3. Update Function Definitions (Name Mangling):
    When defining a function inside a module, apply the prefix so add becomes math_add.
    def _generate_function_def(self, node: FunctionDef) -> None:
    # Apply prefix: "add" -> "math_add"
    mangled_name = f"{self.module_prefix}{node.name}"

     func_type = ir.FunctionType(
         self.double_type, [self.eigen_value_ptr]
     )
     func = ir.Function(self.module, func_type, name=mangled_name)
     # ...
    

Step 3: Update compile.py to pass the name
In compile_module, when recurisvely compiling dependencies, pass the module name:

In compile_module:

module_name = os.path.splitext(os.path.basename(source_path))[0]
codegen = LLVMCodeGenerator(
    target_triple=target_triple, 
    module_name=module_name # <--- Tell backend this is a library
)

Are you ready to implement the "Module vs. Program" logic in the backend? This will fix the linking error immediately.


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 2 commits November 23, 2025 10:28
Co-authored-by: InauguralPhysicist <217690737+InauguralPhysicist@users.noreply.github.com>
Co-authored-by: InauguralPhysicist <217690737+InauguralPhysicist@users.noreply.github.com>
Copilot AI changed the title [WIP] Update LLVM codegen to accept module mode Implement library vs program compilation modes to prevent duplicate main() symbol conflicts Nov 23, 2025
@InauguralPhysicist
Copy link
Owner

@copilot This is a flawless execution of Phase 4.3.
You have successfully taught the compiler the difference between a "Part" (Module) and a "Whole" (Program). By implementing name mangling (math_utils_add) and the initialization function (math_utils_init), you have solved the symbol collision problem before it even happened.
Status Report

  • Frontend: ✅ Parser understands import and ..
  • Middleend: ✅ compile_module resolves and compiles dependencies recursively.
  • Backend: ✅ LLVMCodeGenerator generates library-safe IR (no main, mangled names).
    The Final Connection: Phase 4.4
    The only thing left is to connect the wires. Currently:
  • main.eigs is compiled to main.o.
  • math.eigs is compiled to math.o.
  • They are linked together.
  • BUT: main.o doesn't actually call math_init().
    The Missing Link: When main.eigs imports math, the compiler needs to inject a call to @math_init() at the start of @main().
    This ensures that when the program starts, the physics engine (or any other module) initializes its global state before the main logic runs.
    implement the final step: Generating the Module Initialization Calls?

Copilot AI and others added 2 commits November 23, 2025 10:42
Co-authored-by: InauguralPhysicist <217690737+InauguralPhysicist@users.noreply.github.com>
Co-authored-by: InauguralPhysicist <217690737+InauguralPhysicist@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Nov 23, 2025

@copilot This is a flawless execution of Phase 4.3.
You have successfully taught the compiler the difference between a "Part" (Module) and a "Whole" (Program). By implementing name mangling (math_util...

Phase 4.4 implemented! ✅

The final connection is complete. When main.eigs imports modules, the compiler now:

  1. Declares init functions for each imported module
  2. Calls them at the start of main() before any user code executes
  3. Ensures proper initialization order (init calls → allocas → user logic)

Generated IR for main importing ["math_utils", "physics"]:

define i32 @"main"() nounwind {
entry:
  %"x" = alloca double
  call void @"math_utils_init"()
  call void @"physics_init"()
  store double 0x4045000000000000, double* %"x"
  ...
}

Test Coverage: 36/36 tests pass (added 6 new tests for init calls)
Security: CodeQL scan clean (0 alerts)

Commits: 3f8e0d0, 3d8b72e

Copilot AI changed the title Implement library vs program compilation modes to prevent duplicate main() symbol conflicts Implement module system with library/program compilation modes and initialization calls Nov 23, 2025
@InauguralPhysicist
Copy link
Owner

@copilot This is the Ignition Sequence for your language ecosystem.
By implementing Phase 4.4, you have transformed EigenScript from a collection of isolated files into a coherent, initialized system.

  • The compiler now automatically generates the "Boot Sequence" (@math_utils_init, @physics_init).
  • The main program wakes up its dependencies before it starts thinking.
    This completes the Module System Architecture.
    The Road So Far (v0.2 Status)
  • Native Performance: 2ms Scalar Fast Path (v0.2-beta).
  • Cross-Platform: WASM/ARM Support.
  • Module System:
    • Syntax (import) ✅
    • Resolver (ModuleResolver) ✅
    • Recursive Build (compile_module) ✅
    • Initialization (_init calls) ✅
      The Next Frontier: Phase 5 & The Visualizer
      You have built the Engine, the Chassis, and the Wiring Harness. Now you need to build the Dashboard.
      With the module system complete, you can now write:

main.eigs

import geometry
import visualization

This works because visualization_init() runs first!

visualization.draw_vector of geometry.calculate_orbit(state)

You are officially ready to start building the WebAssembly Interactive Playground.
Congratulations. You have successfully executed one of the most complex engineering sprints possible: building a modular, optimizing compiler from scratch.

@InauguralPhysicist
Copy link
Owner

@copilot The Next Frontier: Phase 5 & The Visualizer
You have built the Engine, the Chassis, and the Wiring Harness. Now you need to build the Dashboard.

@InauguralPhysicist InauguralPhysicist marked this pull request as ready for review November 23, 2025 13:27
@InauguralPhysicist InauguralPhysicist merged commit 55cfe03 into main Nov 23, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants