Rough notes on ambitious programming language design ideas
2024-03-14
Todo: Think up several specific approaches for each design choice so their tradeoffs can be compared and contrasted
- Differentiable type system with automatic unification and inference of refinement/dependent types (tradeoff: we get really powerful, (relatively) simple, and efficient automatic typechecking at the cost of making our type system stochastic)
- Type unification using FFT, Newton's method?
- Key insight: root finding can be used to check function equivalence, since f and g are equivalent exactly when f - g is zero across the whole domain
- FFT can be used to decompose functions into simple repeating waves which are much easier to check. It makes the problem much more computationally tractable since we only need to check a single period of a (co)sine function to generalise to the entire domain.
- With fast and automatic function equivalence checking, we can efficiently typecheck refinement and/or dependent types.
- It should be possible to use type inference with algebraic subtyping/simplesub to infer these more complex types.
- Using that we can eliminate common runtime errors and bugs at compile time (e.g. range errors, null/empty string/NaN errors, division by zero, invalid function args, invalid assignments, invalid pointer references, race conditions) since we know the maximum range of every variable at compile time.
- Sidenote: there was an interesting HN post about finding the difference between two regular expressions, but I can't find it anymore. Could be useful to typecheck strings.
- (Bidirectional) isomorphic abstractions (there is a tradeoff here between isomorphism and zero-cost abstraction, since isomorphism makes certain zero-cost abstractions impossible). (Unidirectional) homomorphism, e.g. some compile-to-JS languages, has the disadvantage that the abstraction can leak details of the underlying implementation, making it hard to identify the origin of a behaviour.
- Isomorphism limits the spread of leaky abstractions since all the abstractions map bijectively into each other. Tradeoffs: This relies on the user understanding the underlying system being mapped into and drastically constrains the design space (potentially undesirable)
- Isomorphism between a visual programming language and a textual language could be interesting
- Moonshot idea: this could be enhanced with declarative and data driven programming paradigms
- Mutable value semantics: references exist implicitly when passing structures to functions (tradeoff: improves ergonomics of common use cases at expense of others)
- Stack allocations only: set an unbounded or known maximum stack size at runtime using setrlimit and linearise all allocations to the lifetime of their scope at compile time. This prevents memory fragmentation but is worse for data cache/TLB efficiency (huge pages might help though). Or we could just heap allocate with stack lifetimes like std::unique_ptr. Either way, you still need some escape hatch for programs that can't avoid shared memory. Tradeoff: Improves allocation ergonomics and safety at the cost of dramatically constraining the design space and limiting the space of possible programs without using an escape hatch (potentially undesirable)
- All memory is automatically allocated and freed at function scope like normal stack variables.
- Variable sized data structures (e.g. a dynamically sized array) can be made by allocating a large block of memory upfront (e.g. 4 GB). Stack allocations are demand paged by the OS so only the occupied pages are resident in physical memory.
- Prevent dangling pointers by only allowing references to point downward on the stack
- This prevents most classes of memory unsafety: use after free, double free, null pointer dereference, stack overflow, uninitialised pointers (with bounds checking you can guarantee memory safety)
- Multithreading can be unified into this stack-centric paradigm by restricting the lifetime of a thread to a function scope like a stack variable, then applying the same downward reference rule by only allowing the thread-local stack to reference downward from the function scope on the original stack.
- This prevents multithreaded memory corruption
- A thread spawned from a thread (child thread) can access both the parent thread's thread local storage below the function scope point as well as the original stack below the parent thread's function scope point
- Thread local storage cannot be referenced by a non-child/descendant thread without guaranteeing that the lifetime of the referenced stack point equals or exceeds the lifetime of the reference
- Tradeoff: We get memory safe and easy low level code but at the cost of certain structural limitations on the program in terms of allocation and concurrency. We still need an escape hatch for certain classes of programs.
- Vale's universal function syntax is quite interesting. Tradeoff: Slightly improved ergonomics at the cost of larger programming language feature scope (bloat?)
- Trailing list commas are good for diffs. Tradeoff: same as above.
- Package managers considered harmful
- One approach: a tool that automatically packages libraries for all of the different system package managers instead of creating a new package manager: security against supply chain attacks, less reliance on a fragile package ecosystem that can break at any time, and it implicitly discourages writing superfluous packages by adding a barrier to entry (a package manager being TOO good is an issue, e.g. npm; tradeoff: improved security, resiliency and quality filtering but worse ergonomics and platform portability, and it externalises complexity onto other projects)
- Another approach: Develop tools that allow developers to scalably understand all code being pulled in through a library. This is much harder. wip
- Hermiticity: hermetic binaries and/or hermetic builds (see cosmopolitan libc); Tradeoff: improved resiliency, stability and reproducibility at the cost of increased complexity and potentially worse security
- Strive to create a programming language that gives a good experience for users of downstream software, not just the developers using the language (think GPL vs MIT); Tradeoff: improving users' lives can make developers' lives worse
- Conway's law: what can we do about it? (wip)
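
Sketch for the function equivalence / stochastic typechecking note above. This is not the FFT or Newton's method idea itself; it swaps in plain random-point testing of f - g, which is the simplest stochastic check I can think of. The name `probably_equal`, the sample range and the trial count are all made up for illustration. A mismatch is a definite counterexample; agreement on every sample is only evidence, not proof.

```python
import random


def probably_equal(f, g, trials=32, lo=-10**6, hi=10**6, seed=0):
    """Compare f and g on random integer points.

    Returns (True, None) if they agree on every sampled point,
    or (False, x) where x is a concrete point at which they differ.
    """
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    for _ in range(trials):
        x = rng.randint(lo, hi)
        if f(x) != g(x):
            return False, x  # f - g is nonzero here: not equivalent
    return True, None  # no counterexample found (evidence, not proof)


# Two spellings of the same polynomial, and one that is off by one.
square_a = lambda x: (x + 1) ** 2
square_b = lambda x: x * x + 2 * x + 1
off_by_one = lambda x: x * x + 2 * x + 2

same, _ = probably_equal(square_a, square_b)
different, counterexample = probably_equal(square_a, off_by_one)
```

Integer inputs keep the comparison exact; floating point functions would need a tolerance, which is another place the "stochastic" tradeoff shows up.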
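
Sketch for the "we know the maximum range of every variable at compile time" note above. This is ordinary interval arithmetic, assuming integer variables only; `Interval`, `division_is_safe` and `index_is_safe` are hypothetical names, and a real checker would run this over the program's AST rather than on hand-built values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Inclusive range [lo, hi] that a variable is known to stay within."""
    lo: int
    hi: int

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Smallest result: lo minus the other's largest value, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def contains(self, value):
        return self.lo <= value <= self.hi


def division_is_safe(divisor):
    """Division is provably safe only if zero is outside the divisor's range."""
    return not divisor.contains(0)


def index_is_safe(index, length):
    """Indexing is provably safe only if the whole range fits in [0, length)."""
    return 0 <= index.lo and index.hi < length


x = Interval(1, 10)
y = Interval(0, 5)
difference = x - y  # Interval(-4, 10), which includes zero
```

Here `division_is_safe(x)` holds but `division_is_safe(x - y)` does not, so a compiler using this would reject `a / (x - y)` unless the range is narrowed first.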
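
Sketch for the note above about restricting a thread's lifetime to a function scope. Python can't enforce the downward reference rule, so this only shows the lifetime half: every thread is joined before the scope that spawned it exits, so data owned by that scope outlives the threads using it. `thread_scope` and `sum_in_parallel` are hypothetical helpers, not an existing API.

```python
import threading
from contextlib import contextmanager


@contextmanager
def thread_scope():
    """Yield a spawn function; join every spawned thread on scope exit."""
    threads = []

    def spawn(target, *args):
        t = threading.Thread(target=target, args=args)
        threads.append(t)
        t.start()
        return t

    try:
        yield spawn
    finally:
        # No thread can outlive the scope that created it.
        for t in threads:
            t.join()


def sum_in_parallel(data):
    partial = [0, 0]  # owned by this function's scope
    mid = len(data) // 2

    def work(slot, chunk):
        partial[slot] = sum(chunk)  # each thread writes only its own slot

    with thread_scope() as spawn:
        spawn(work, 0, data[:mid])
        spawn(work, 1, data[mid:])
    # Both threads are joined here, so reading `partial` is safe.
    return partial[0] + partial[1]


total = sum_in_parallel(list(range(101)))
```

In the language itself the compiler would presumably reject any reference from the spawning scope up into a thread's stack; here that part is only a convention.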