adding more stuff to the blog

Signed-off-by: Jakub Doka <jakub.doka2@gmail.com>
This commit is contained in:
Jakub Doka 2024-12-20 22:54:46 +01:00
parent 4b3b6af70e
commit 1621d93e86
No known key found for this signature in database
GPG key ID: C6E9A89936B8C143

View file

@ -28,4 +28,34 @@ It took around 4 months to reimplement everything make make the optimal code loo
## How my understanding of optimizations changed
I need to admit, before writing a single pass compiler and later upgrading it to optimizing one, I took optimizations as some magic that makes code faster and honestly believed they are optional and most of the hard work is done in the process of translating readable text to the machine code. That is almost true with minus the readable part. If you want the code you write to perform well, with a compiler that translates your code from text to instructions as its written, you will be forced to do everything modern optimizers do, by hand in your code. TODO...
### Optimizations allow us to scale software
I need to admit, before writing a single pass compiler and later upgrading it to optimizing one, I thought optimizations only affect the quality of final assembly emitted by the compiler. It never occur to me that what the optimizations actually do, is reduce the impact of how you decide to write the code. In a single pass compiler (with zero optimizations), the machine code reflects:
- order of operations as written in code
- whether the value was stored in intermediate locations
- exact structure of the control flow and at which point the operations are placed
- how many times is something recomputed
- operations that only help to convey intent for the reader of the source code
- and more I can't think of...
If you took some code you wrote and then modified it to obfuscate these aspects (in reference to the original code), you would to a subset of what optimizing compiler does. Of course, a good compiler would try hard to improve the metrics its optimizing for, it would:
- reorder operations to allow the CPU to parallelize them
- remove needless stores, or store values directly to places you cant express in code
- pull operations out of the loops and into the branches (if it can)
- find all common sub-expressions and compute them only once
- fold constants as much as possible and use obscure tricks to replace slow instructions if any of the operands are constant
- and more...
In the end, compiler optimizations try to reduce correlation between how the code happens to be written and how well it performs, which is extremely important when you want humans to read the code.
### Optimizing compilers know more then you
Optimizing code is a search problem, an optimizer searches the code for patterns that can be rewritten so something more practical for the computer, while preserving the observable behavior of the program. This means it needs enough context about the code to not make a mistake. In fact, the optimizer has so much context, it is able to determine your code is useless. But wait, didn't you write the code because you needed it to do something? Maybe your intention was to break out of the loop after you are done, but the optimizer looked at the code and said, "great, we are so lucky that this integer is always small enough to miss this check by one, DELETE", and then he goes "jackpot, since this loop is now infinite, we don't need this code after it, DELETE". Notice that the optimizer is eager to delete dead code, it did not ask you "Brah, why did you place all your code after an infinite loop?". This is just an example, there are many more cases where modern optimizers just delete all your code because they proven it does something invalid without running it.
Its stupid but its the world we live in, optimizers are usually a black box you import and feed it the code in a format they understand, they then proceed to optimize it, and if they find a glaring bug they wont tell you, god forbid, they will just molest the code in unspecified ways and spit out whats left. Before writing an optimizer, I did no know this can happen and I did not know this is a problem I pay for with my time, spent figuring out why noting is happening when I run the program.
But wait its worse! Since optimizers wont ever share the fact you are stupid, we end up with other people painstakingly writing complex linters, that will do a shitty job detecting things that matter, and instead whine about style and other bullcrap (and they suck even at that). If the people who write linters and people who write optimizers swapped the roles, I would be ranting about optimizers instead.
And so, this is the area where I want to innovate, lets report the dead code to the frontend, and let the compiler frontend filter out the noise and show relevant information in the diagnostics. Refuse to compile the program if you `i /= 0`. Refuse to compile if you `arr[arr.len]`. This is the level of stupid optimizer sees, once it normalizes your code, but proceeds to protect your feeling. And hblang will relay this to you as much as possible. If we can query for optimizations, we can query for bugs too.