... a bootstrapping path to a C compiler capable of
Compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.

Jeremiah Orians a5ba3acfde Remove unneeded python and javascript programs to https://git.sr.ht/~oriansj/knight_web_debugger 11 months ago
High_level_prototypes a2c8a371a2 Make a more useful error message when the file does not exist 3 years ago
Knight Reference 1582f2eef4 Added SET instructions 4 years ago
Library function prototypes 4870a02c55 Merged branches 5 years ago
POSIX @ fe529ef2e1 d4ed9af521 Incorporating the new work of the wonderful people on #bootstrappable 1 year ago
UEFI @ 8ee15ee90e d4ed9af521 Incorporating the new work of the wonderful people on #bootstrappable 1 year ago
functions 5adaf5e738 Reducing the number of possible segfaults through fuzzing 4 years ago
seed @ 4715ae5292 0938d68f63 Fix broken builds 3 years ago
stage0 bd44e90e42 Added High level prototype for stage0_monitor 5 years ago
stage1 303ed753df Clarified some constants 4 years ago
stage2 fd5367ae71 fixed lisp linkage 4 years ago
stage3 4870a02c55 Merged branches 5 years ago
test 59d044b0c0 [PATCH] Reduce memory requirements in M0 and properly report errors for invalid instructions on 16bit versions of knight 4 years ago
x86 4fec19dd0d Update memory map and add INT list 3 years ago
.gitignore 8ebcfbbd5a Implemented C version of cc_knight-native 5 years ago
.gitmodules d4ed9af521 Incorporating the new work of the wonderful people on #bootstrappable 1 year ago
CHANGELOG.org 848d6c1e74 AArch64 support added 4 years ago
HACKING 5c9aa4ab15 Fix license header typo 6 years ago
LICENSE 4c307c763c Incorporated official GPL license 7 years ago
Programming notes.org 09a2bfdba5 Broke out Knight information to enable proper architecture development 4 years ago
README.org 534d071369 fix typo 2 years ago
bootstrapping Steps.org 7b428ece45 Made seperate files for globals and type definitions 4 years ago
dynamic_execution_trace.c 87477db7eb Added requested Copyright notices 7 years ago
makefile 251d7e41eb Add build target for execve_image 3 years ago
questions_asked_by_new_people.org a6238b4780 Another question 2 years ago
tty.c 87477db7eb Added requested Copyright notices 7 years ago
vm.c 251d7e41eb Add build target for execve_image 3 years ago
vm.h 6299b7dcfa Broke out HALCODEs into a seperate file 4 years ago
vm_decode.c 6299b7dcfa Broke out HALCODEs into a seperate file 4 years ago
vm_globals.c 086c5f0ed2 Allow setting of TTY input and output files 4 years ago
vm_halcode.c bea17db92f make lseek to better match posix standard 3 years ago
vm_instructions.c 6299b7dcfa Broke out HALCODEs into a seperate file 4 years ago
vm_minimal.c c369c9c492 Improved vm scriptability and broke out a minimal vm definition for people wishing to keep implementation trivial 7 years ago
vm_types.h 7b428ece45 Made seperate files for globals and type definitions 4 years ago
wrapper.c 6299b7dcfa Broke out HALCODEs into a seperate file 4 years ago

README.org

The master repository for this work is located at: https://savannah.nongnu.org/projects/stage0/

If you wish to contribute:

pull requests can be made at https://github.com/oriansj/stage0 and https://gitlab.com/janneke/stage0 or patches/diffs can be sent via email to Jeremiah (at) pdp10 [dot] guru or join us on libera.chat's #bootstrappable or update the wiki at https://bootstrapping.miraheze.org/wiki/Stage0

Those wishing to work on POSIX ports of stage0 can do so here: https://github.com/oriansj/stage0-posix Those wishing to do work with CPM/DOS porting let me know (I have some easy work for you)

Goal

This is a set of manually created hex programs in a Cthulhu Path to madness fashion. Which only have the goal of creating a bootstrapping path to a C compiler capable of compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.

Additionally, all code must be able to be understood by 70% of the population of programmers. If the code can not be understood by that volume, it needs to be altered until it satisfies the above requirement.

Also found within

This repo contains a few of my false start pieces that may be of interest to people who want to independently create the root binary. I welcome all bug fixes and code that aids in the above stated goal.

A link to the successful POSIX ports for x86, AMD64 and AArch64 in the POSIX submodule

FYI

I'll be adding more code and documentation as I build pieces. ALL code in this REPO is under the GPLv3 or Later.

In order to build stage0 and all the pieces, one only needs to run make all. Each individual piece can be built by simply running make $piece with $piece being replaced by the actual part you want to make.

The only pieces that have any external dependencies are the Web IDE (Python3+CherryPy), libvm (GCC) and vm (GCC+GNU getopt) Those wishing to work in Python, please checkout https://github.com/markjenkins/knightpies He does an amazing job

Future

Software

Add more ports to more hardware platforms.

Hardware

Implement the Knight processor in FPGA and then convert into TTL.

Need to know information

This repository utilizes submodules, so you need to clone this repository using `git clone –recursive`. If you have already cloned it run `git submodule update –init` or after a pull be sure to do: `git submodule update –recursive`

stage0

The stage0 is the ultimate lowest level of bootstrap that is useful for systems without firmware, operating systems nor any other provided software functionality. Those with such capabilities can skip this stage as it requires human input.

Hex0_monitor

The Hex0_monitor provides dual functionality:

  1. It assembles hex0 programs manually typed in
  2. It writes the characters, providing minimal text input functionality.

The first is essential for creating of the root binaries. The second is essential for creating source files before you have an editor. The distinction is important because only the Hex0 assembler in stage1 is built by the Hex0_monitor and from that point onwards it is used as a minimal text editor until a more advanced text editor can be bootstrapped.

stage1

The stage1 is dependent on the availability of text source files and at least a hex0 monitor or assembler. The steps in this stage can be fully automated should one trust their automation or performed manually on any hardware they trust.

Regardless of which method selected, the resulting binaries MUST be identical.

Hex0

The Hex0 assembler or stage1_assembler-0 is the head node of the stage1 bootstrap. Its functionality is reduced compared to the stage0 monitor simply because it only performs half of the required functions; that of generating binaries from hex0 source files.

Its most important features of note are: ; line comments and

As careful notes are essential for this stage.

Hex1

The Hex1 assembler or stage1_assembler-1 is the next logical extension of the Hex0 assembler, single character labels and relative displacement using a prefix. In this case labels start with : thus the label a must be written :a and the prefix for relative offsets is @ thus the pointer must be written @a Further because of the mescc-tools standardization of syntax @label indicates a 16bit relative displacement.

Alternative architectures porting this need not limit themselves to 16bit displacements should they so choose, rather they must provide at least 1 size of displacement or if they so desire, they may skip and write their Hex2 assembler in Hex0 but as it is a much larger program, I recommend against it.

Hex2

The Hex2 assembler or stage1_assembler-2 or hex2_linker is as complex of a hex language that is both meaningful and worth the effort.

Hex2's important advances over Hex1 are as follows: Support for long labels (Minimal 42 chars, ideally unlimited) Support for Absolute addressing ($label for 16bit absolute addresses) Support for Alternative pointer sizes (%label for 32bit relative and &label for 32bit absolute addresses)

Optionally support for !label (8bit relative addressing) and ?label (Architecture specific size/properties) and/or @label1>label2 %label1>label2 displacements may be implemented should the specific architecture require it for human readable hex2 source files (such as ELF headers).

M0

M0 or M0-macro or M1-macro is the minimal string replacement program with string processing functionality required to convert an Assembly like syntax into Hex2 programs that can be compiled. Its rules are merely an extension of Hex2 with the goal of reducing the amount of hex that one would need to write.

The 3 essential pieces are:

  1. DEFINE STRING1 HEX_CHARACTERS (No extra whitespace nor \t or \n inside

definition)

  1. "Raw strings" allow every character except " as there is no support for

string escapes, including NULL; which are converted to Hex chars for Hex2 To convert back to the chars inside of the "quotes" with the addition of a trailing NULL character or the number desired (Must be at least 1, no upper bound) and restrictions such as padding to word boundaries are acceptable.

  1. 'Raw char strings' will be passing anything inside of them (except ' which

terminates the string).

Thus by combining :label, @label, DEFINE SYSCALL 0F05, Raw strings and chars; one has created a rather flexible and powerful Assembler capable of building far more ambitious pieces in "Macro Assembly".

stage2

The stage2 is dependent on the availability of text source files and at least a functional macro assembler and can be used to build operating systems or other "Bootstrap" functionality that might be required to enable functional binaries; such as programs that set execute bits or generate dwarf stubs.

FORTH

Because a great many people stated FORTH would be an ideal bootstrapping language the time and effort was put forth by Caleb and Jeremiah to provide a framework for those people to contribute immediately; thus the FORTH was born.

Several efforts were taken to make the FORTH more standard but ultimately it was determined, Assembly was preferable as the underlying architecture wasn't total garbage.

It now sits waiting for any FORTH programmer who wishes to prove FORTH is a real bootstrapping language.

Lisp

The next recommendation in bootstrapping was Lisp, so efforts were taken to design the most minimal Lisp with all of the functionality described in the original Lisp papers. The task was completed relatively quickly compared to the FORTH and even had enhancements such as a compacting garbage collector.

Ultimately it was found, the lisp that many rave about isn't entirely compatible with modern lisps or schemes; thus was shelved for any Lisper who wishes to pick it up.

C

After being told for months there is no way to write a proper C compiler in assembly and months of research without any human written C compilers in assembly found. To prove the point Jeremiah decided the First C compiler on the bootstrap would actually be a cross-compiler for x86, such that everyone would be able to verify it did exactly what it was supposed to and see it self-host its C version.

stage3

The stage3 is dependent on the availability of text source files and at least a functional M2-Planet level C compiler, FORTH and a Minimal Garbage collecting Lisp and can be used to build more advanced tools that can be used in bootstrapping whole operating systems with modern tool stacks.

initial_library

A library collection of very useful FORTH functionality designed to make the lives of any FORTH programmer easier.

It now sits waiting for any FORTH programmer who wishes to build upon it.

ascension

A library collection of useful Lisp functionality designed to make the lives of any Lisp programmer easier.

As it depends on archaic Lisp dialect; it will likely need to be replaced should the Lisp be properly fixed.

blood-elf_x86

The x86 program for a dwarf stub generator used in mescc-tools bootstrapping. Specifically mescc-tools-seed generation, which can be used to build M2-Planet and thus complete the circle.

get_machine_x86

The trivial x86 program that allows one to skip tests or scripts that will not run on that specific platform or run alternative commands depending upon the architecture.

hex2_linker_x86

The program that allows one to build the hex2 programs for any hardware platform on x86 and thus verify software builds for hardware one does not even have.

M1-macro_x86

The program that allows one to build the M1 program for any hardware platform on x86 and thus verify software builds for hardware one does not even have.

M2-Planet_x86

The x86 port of the M2-Planet C compiler v1.0 used as one of the paths in bootstrapping M2-Planet on x86 hardware.

Inspirations

This work wouldn't have come so far without the inspirational work of others They are in alphabetical order of the Author's last names

GRIMLEY EVANS, Edmund - bcompiler [http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.html] :: The inspiration for hex0, hex1 and hex2 GRIMLEY EVANS, Edmund - cc500 [http://homepage.ntlworld.com/edmund.grimley-evans/cc500] :: The inspiration for M2-Planet Jones, Richard W.M. - jonesforth [http://git.annexia.org/?p=jonesforth.git] :: The inspiration for stage2 FORTH and initial_library Piner, Steve and Deutsch, L. Peter - Expensive Typewriter [http://archive.computerhistory.org/resources/text/DEC/pdp-1/DEC.pdp_1.1972.102650079.pdf] :: The inspiration for SET kragensitaker - The Monitor [https://old.reddit.com/r/programming/comments/9x15g/programming_thought_experiment_stuck_in_a_room/c0ewj2c/] :: The inspiration for the hex0-monitor