summaryrefslogtreecommitdiff
path: root/content/b
diff options
context:
space:
mode:
authorlemon <lsof@mailbox.org>2022-03-03 10:11:01 +0100
committerlemon <lsof@mailbox.org>2022-03-03 10:11:01 +0100
commit82624418d5940ef628a51ca07aa222073fa9a388 (patch)
tree247c542ea5cf952889bd394d1bb3e3a32ee3932f /content/b
parent12ddb91334e82c87f6318beea879eb53f1fd6f1c (diff)
change some things around
Diffstat (limited to 'content/b')
-rw-r--r--content/b/_index.md6
-rw-r--r--content/b/c99-vla-tricks.md308
-rw-r--r--content/b/first.md15
3 files changed, 0 insertions, 329 deletions
diff --git a/content/b/_index.md b/content/b/_index.md
deleted file mode 100644
index c504e1b..0000000
--- a/content/b/_index.md
+++ /dev/null
@@ -1,6 +0,0 @@
-+++
-title = "Blog"
-sort_by = "date"
-template = "blog.html"
-page_template = "blog-page.html"
-+++
diff --git a/content/b/c99-vla-tricks.md b/content/b/c99-vla-tricks.md
deleted file mode 100644
index b2b07eb..0000000
--- a/content/b/c99-vla-tricks.md
+++ /dev/null
@@ -1,308 +0,0 @@
----
-title: "C99 doesn't need function bodies, or 'VLAs are Turing complete'"
-date: 2022-02-19
----
-
-Preface
-=======
-
-The 1999 revision to the C programming language brought several interesting
-changes and additions to the standard. Among those are variable length arrays,
-or VLAs for short, which allow array types to have lengths that are determined
-at runtime. These can show up in different contexts, but for the purposes of
-this post we will be looking at VLAs as parameters to functions. For example:
-
- void f(int n, float v[n]);
-
-Since array parameters decay to their corresponding pointer type, that
-declaration is functionally equivalent, and compatible ([6.7.5.3p7]) to the
-following:
-
- void f(int n, float *v);
-
-The length of `v` in the first declaration is given by the value of a previous
-parameter of the function, which can only be known at runtime. However, in a
-function prototype (i.e. not a function definition) like the above, the length
-of a VLA parameter is never computed and essentially ignored ([6.7.5.2p5]).
-But in a function definition, the length is evaluated and must be greater than
-zero. In practice this is rather useless because the sizeof operator doesn't
-work as one might expect it to with array parameters ([6.5.3.4-note88]):
-
- void f(int n, float param[n]) {
- float local[n];
- int a = sizeof local,
- b = sizeof param;
-
- printf("n = %d\n", n);
- printf("sizeof local = %d\n", a); // this will be n * sizeof(float)
- printf("sizeof param = %d\n", b); // but this will be sizeof(float *)
- }
-
-In other words, using sizeof with a VLA will evaluate the runtime-known length
-and calculate the array size based on that, except when that VLA is a function
-parameter, then, for whatever reason, it is the size of the corresponding
-pointer type (in this example, `sizeof local === n * sizeof(float)` but `sizeof
-param == sizeof(float *)`). The length of a VLA parameter may be used when e.g.
-computing indices when accessing multi-dimensional variable length arrays.
-
-Alas, the standard mandates that the variable array length be computed when
-the function is called. Of course, the expression in between the square
-brackets is not limited to simple expressions like `n`, so one can write
-something like:
-
- void f(int a, int b, int m[(a + b) / 2]) {}
-
-or
-
- void f(int x, int b[abs(x)]) {}
-
-or even
-
- void f(int v[getchar()]) {}
-
-
-[6.7.5.3p7]: http://port70.net/~nsz/c/c99/n1256.html#6.7.5.3p7
-[6.7.5.2p5]: http://port70.net/~nsz/c/c99/n1256.html#6.7.5.2p5
-[6.5.3.4-note88]: http://port70.net/~nsz/c/c99/n1256.html#note88
-
-
-Disembodiment
-=============
-
-The following program should give you an idea of the kind of constructs
-that these rules allow for ([try it on compiler explorer][a]):
-
- int main(int argc, char *argv[printf("Hello")])
- {
- printf(" world!\n");
- }
-
-The length expression of the VLA is evaluated before any of the statements in
-`main`. I couldn't find anywhere in the standard saying whether this evaluation
-order is well-defined but it is what clang and gcc do, and luckily, it does not
-matter for the sake of this article, as we will see shortly.
-
-Let us refer to the subset of C99 where function bodies must be empty as
-*disembodied C*. You might naturally ask yourself what things can be
-accomplished in this subset (though you can probably guess the answer from the
-title of this page). In other words, what can you do in C if you are limited to
-just evaluating expressions and no statements?
-
-- Using the comma operator, we can sequence arbitrarily many expressions for
- their side effects, so
-
- void f() {
- printf("Press enter to confirm: ");
- getchar();
- printf("thanks.\n");
- }
-
- becomes
-
- void f(char _[(
- printf("Press enter to confirm: "),
- getchar(),
- printf("thanks.\n"),
- 1
- )]) {}
-
- In a comma expression, the operands are evaluated left-to-right and the value
- of the last operand is the resulting value of the whole expression. The `1`
- at the end ensures that the evaluated array length is >0 (to avoid UB).
-
-- Functions in disembodied C are going to need a dummy parameter where the VLA
- length expression evaluation can take place. For consistency, we will denote
- it as `char _[...]` and give it the value `""` (the empty string) when calling
- said functions (note that the value we give it doesn't actually matter, though
- its size should be at least as big as the computed VLA size to avoid UB).
-
-- If-else statements can be replaced with ternary conditional expressions, such that
-
- void f(int n) {
- if (n < 0)
- printf("negative!");
- else if (n > 0)
- printf("positive!");
- else
- printf("zero!");
- }
-
- becomes
-
- void f(int n, char _[(
- (n < 0) ?
- printf("negative!")
- : (n > 0) ?
- printf("positive!")
- :
- printf("zero!")
- , 1
- )]) {}
-
-- Remember that the VLA length expression can access previous function arguments,
- so parameter passing is essentially unchanged.
-
-- We cannot return values, but we can use out parameters by taking advantage of
- the fact that assignments are expressions in C, so instead of
-
- int imax(int a, int b) {
- return a > b ? a : b;
- }
-
- we can write
-
- void imax(int *out, int a, int b, char _[
- (*out = a > b ? a : b),
- 1
- ]) {}
-
-- We cannot define local variables inside of expressions, but we can just add
- extra function parameters to use as temporaries, rewriting
-
- void fswapf(float *a, float *b) {
- float tmp = *a;
- *a = *b;
- *b = tmp;
- }
-
- as
-
- static void fswapf_aux(float *a, float *b, float tmp, char _[(
- tmp = *a,
- *a = *b,
- *b = tmp,
- 1
- )]) {}
-
- void fswapf(float *a, float *b, char _[(
- fswapf_aux(a, b, 0, ""), 1
- )]) {}
-
- Alternatively, if re-entrancy and thread-safety are disregarded, we could
- just use global (static) variables.
-
-- What about loops? Clearly we cannot use `while` or `for` inside expressions.
- Thankfully, they are unnecessary thanks to recursion. For example:
-
- float sum(float *v, size_t n) {
- float sum = 0.0f;
- for (size_t i = 0; i < n; ++i)
- sum += v[i];
- return sum;
- }
-
- can be expressed as
-
- /* the forward declaration is necessary */
- static void sum_aux(float *out, float *v, size_t n, char *);
- static void sum_aux(float *out, float *v, size_t n, char _[(
- (n > 0) ? (
- *out += *v,
- sum_aux(out, v + 1, n - 1, ""),
- 1
- ) : 1
- )]) {}
-
- void sum(float *out, float *v, size_t n, char _[(
- *out = 0.0f,
- sum_aux(out, v, n, ""),
- 1
- )]) {}
-
- In fact, any imperative-style loop can be turned into an equivalent recursive
- loop (as any functional programmer will be happy to demonstrate), though since
- C lacks closures or any form of anonymous functions, it can get quite unwieldy
- (I hope you like auxiliary functions).
-
- The astute reader might point out that these two versions of `sum` are not
- equivalent because the recursive definition may cause a stack overflow for
- large enough values of `n`. This is unfortunately true, and the major hurdle
- for the practicality of disembodied C, but does not preclude Turing
- completeness (an ideal [Turing machine] has infinite memory at its
- disposal). Luckily, modern compilers are smart, and if we write our functions
- carefully to be [tail recursive], they will often be able to perform tail
- call elimination, removing the risk of a stack overflow. For the above
- example, both gcc and clang are able to optimize the tail call ([see on
- compiler explorer][b]).
-
-- Although not relevant to standard C99, in case you had the idea of using [gcc
- statement expressions] to bypass these limitations, the compiler will stop
- you right on your tracks since it doesn't allow those outside of function
- bodies.
-
-
-[Turing machine]: https://en.wikipedia.org/wiki/Turing_machine
-[tail recursive]: https://en.wikipedia.org/wiki/Tail_call
-[gcc statement expressions]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
-[a]: https://godbolt.org/z/f3chYMvh5
-[b]: https://godbolt.org/z/b1aG78os5
-
-
-Limitations
-===========
-
-After all of that, it seems that basically anything can be done in disembodied
-C, so, is there something that *can't* be expressed in this C99 subset?
-
-With the rules that we've laid down so far, yes. In particular:
-
-- It is not possible in the general case to use APIs that require callbacks.
- For example, look at the standard `qsort` function:
-
- void qsort(void *base, size_t nmemb, size_t size,
- int (*compar)(const void *, const void *));
-
- The `compar` parameter is a pointer to a function that should return the
- result of comparing two values in the given array. Since its parameters are
- void pointers, we can't have the VLA argument we need to perform computations
- (arrays of `void` are not valid C), and we also cannot provide an `int`
- return value. Thus there is no way (that I know of) to use this function in
- standards-compliant disembodied C.
-
- This is not a dealbreaker since we could just reimplement `qsort` ;).
-
-- We cannot access the program arguments. The entry point of a program in
- disembodied C looks like this:
-
- int main(int argc, char *argv[ /* code here ... */ ]) {}
-
- In the VLA length expression for `argv`, `argc` is in scope, but `argv` is
- not.
-
- Fortunately there is a way to work around this: we can use the common
- extension where `main` can take a third argument which is a pointer to the
- environment. According to the standard ([J.5.1]), this is implemented in
- hosted environments. In practice, POSIX environments and Microsoft both
- support it. Then, our entry point looks like this instead:
-
- int main(int argc, char **argv, char *envp[/* ... */]) {}
-
- Now `argv` is in scope within the `envp` array length expression.
-
- Side note: even though we can't `return`, `main` is the exception to the rule
- that reaching the closing `}` of a function returning non-void is verboten,
- and is equivalent to returning zero ([5.1.2.2.3]). If we need to terminate
- with an exit code other than zero, we can use `exit()`.
-
-
-[J.5.1]: http://port70.net/~nsz/c/c99/n1256.html#J.5.1
-[5.1.2.2.3]: http://port70.net/~nsz/c/c99/n1256.html#5.1.2.2.3
-
-
-Final Words
-===========
-
-It seems that with enough effort anything can be done with these peculiar
-restrictions. As an example, I wrote implementations of the [MD5] and [SHA256]
-hash functions as well as [Conway's Game of Life using SDL for graphics][life].
-I was also working on a more interesting and "useful" program but I kind of
-lost motivation halfway through. I'll update this article if I ever come back
-to finish that.
-
-In case it wasn't clear enough, there is zero practical reason to ever do any
-of this. I hope you'll agree it's a fun party trick though :).
-
-
-[MD5]: https://git.sr.ht/~lsof/x/tree/main/item/vla/md5sum.c
-[SHA256]: https://git.sr.ht/~lsof/x/tree/main/item/vla/sha256sum.c
-[life]: https://git.sr.ht/~lsof/x/tree/main/item/vla/life.c
diff --git a/content/b/first.md b/content/b/first.md
deleted file mode 100644
index 291b364..0000000
--- a/content/b/first.md
+++ /dev/null
@@ -1,15 +0,0 @@
-+++
-title = "test post"
-date = 2022-02-02
-+++
-
-# test
-lorem ipsum dolor sit amet
-
- u16 rng(u16 s){
- s^=s<<8,s=(s<<8
- |s>>8)^((s&0xff
- )<<1),s=0x7e00^
- (s>>1)^(0x9e74&
- -(~s&1));return
- 46497^s?s:s^s;}