diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..8843d37 --- /dev/null +++ b/Makefile @@ -0,0 +1,25 @@ +src := content +cache := .cache +www := _site/posts + +post_compiler := src/webcc.ml +builddir := _build/default + +POSTCC := $(builddir)/$(patsubst %.ml,%.exe,$(post_compiler)) + +post-sources := $(shell find $(src)/ -type f) +post-htmls := $(patsubst $(src)/%.md,$(www)/%.html,$(post-sources)) + +all: $(POSTCC) $(post-htmls) + +$(www): + mkdir -p $(www) + +$(www)/%.html: $(src)/%.md $(www) + $(POSTCC) $< -o $@ + +$(POSTCC): $(post_compiler) + dune build + +clean: + rm $(POSTCC) diff --git a/content/adafruit-lcd.md b/content/adafruit-lcd.md new file mode 100644 index 0000000..4b69ffb --- /dev/null +++ b/content/adafruit-lcd.md @@ -0,0 +1,161 @@ +title: HD44780 LCD with OCaml +tags: electronics, ocaml +date: 2019-08-28 00:00 +--- + +Some time ago I've purchased an [Adafruit HD44780 +LCD](https://www.adafruit.com/product/181) as a fancy accessory for +playing around with Raspberry Pi. It was lying around in my office for +some time until recently. I finally got around to putting it together +(many thanks to my office mate Carlo for helping me solder and wire the +thing). + +It came with a [Python +library](https://github.com/adafruit/Adafruit_CircuitPython_CharLCD) +for controlling it via a high-level API. However, I don't really like +Python, and I really don't like the design of CircuitPython. So I +decided to write an interface to the LCD in OCaml, that I originally +wanted to use on a Raspberry Pi (using the bindings to the WiringPi +library). + +The code can be found in [my fork of the ocaml-wiringpi repository](https://github.com/co-dan/ocaml-wiringpi/tree/lcd_lwt/examples) under the `lcd_lwt` branch. +It mainly consists of two parts: +the low-level interface for writing data to the display and invoking +some operations, and a high-level interface powered by a "cursor" +module. + +## The basics + +The first is a low-level API that controls the display using 4 data +pins (D4-D7) and two special "control" pins. The first control pin is EN +("enable pin") which is used to signal the start of data read/write +operations. The RS ("register select pin") is used as a flag for +selecting which registers to address. The low value denotes the +instruction register, and the high value denotes the data register. + +For example, to clear the display (instruction 0x01 = 0b00000001) one would send the following signals +to the LCD: + +1. send 0 to RS +2. send 0, 0, 0, 0 to D4, D5, D6, D7, resp -- the four upper bits in reverse order +3. send 0, 1, 0 to EN -- signal that the first four bits have been sent +4. send 1, 0, 0, 0 to D4, D5, D6, D7, resp -- the four lower bits in reverse order +3. send 0, 1, 0 to EN -- signal that the last four bits have been sent + +Internally, this sequence of signals stores 0b00000001 in the instruction register IR. + +Instead of sending the instructions to the LCD, we can instead send +data, which ends up in the display data RAM (DDRAM). Suppose we want +to display a character 'a' (ascii 97 = 0b1100001). We then do the +following sequence of operations: + +1. send 1 to RS -- address the data register +2. send 0, 0, 1, 1 to D4, D5, D6, D7, resp -- the four upper bits in reverse order +3. send 0, 1, 0 to EN -- signal that the first four bits have been sent +4. send 0, 0, 0, 1 to D4, D5, D6, D7, resp -- the four lower bits in reverse order +3. send 0, 1, 0 to EN -- signal that the last four bits have been sent + +We implement this operation of writing 8 bits to either the +instruction register or the data register as the following function: + +```ocaml +write8 : Lcd.t -> ?char_mode:bool -> char -> unit +``` + +where `char_mode` determines whether we write the data to the data +register (`true`) or to the instruction register (`false`). + +To write out a string to be displayed we can use `Bytes.iter`: + +```ocaml +let write_bytes lcd bts = + Bytes.iter (fun c -> write8 lcd c ~char_mode:true) bts +``` + +The location in which we store the data is determined by the address +counter, which is automatically incremented after each write. This +means that we can just send a sequence of characters sequentially, +without touching the address counter ourselves. + +## Operations with arguments + +The way that operations with arguments are typically represented is by +using e.g. 4 bits to denote the operation and 4 bits to denote the +arguments. For instance, to activate the 2 line mode, we invoke the +"function set" operation which requires the upper 4 bits to be `0010`, +with the argument "2 line mode" which requires the lower 4 bits to be +`1000`. To obtain the code for the whole operation we just take the +bitwise OR: + +``` +0b0000 1000 -- 2 line mode +0b0010 0000 -- function set +___________ +0b0010 1000 +``` + +This translates to the following mini program in OCaml: + +``` +let _lcd_2line = (0x08) in +let _lcd_functionset = (0x20) in +write8_unsafe lcd (_lcd_2line lor _lcd_functionset); +``` + +Let us a consider a simple example: shifting the characters. +By default, the data that is actually displayed on the LCD is taken +from the addresses 0x0..0x7 and 0x40..0x47 for the first and the +second rows, resp. If we want to display further characters we can use +this shift operation. (See FIGURE 5 in the docs). + +To do this we invoke the "cursor/display shift" operation settin the +appropriate bits for moving the display and the move direction: + +``` +let _lcd_cursorshift = (0x10) +(* 00010000 *) +let _lcd_displaymove = (0x08) +(* 00001000 *) +let _lcd_moveright = (0x04) +(* 00000100 *) +(* -------- *) +(* 00011100 *) + +let shift_right lcd = + write8_unsafe lcd (_lcd_cursorshift lor _lcd_displaymove lor _lcd_moveright) +``` + +## Higher-level interface + +We can provide a slightly higher-level interface by keeping track of the cursor (and some other settings) in the program. + +```ocaml + type Cursor.t = { x: int; y: int; visible: bool; blink: bool; _lcd: mono_lcd } +``` + +A /cursor/ is a record that combines the underlying `mono_lcd` type (which stores the pin layout), the current position of the cursor `x, y` and some settings (whether the cursor should be visible and blinking). +Then all the operation on the cursor are just functions `Cursor.t -> Cursor.t`. +For example, a function that write out a string takes in a cursor, writes the underlying bytes using `write_bytes` and updates the cursor position. +A function that sets the blinking flag writes out the desired bytes (as descirbed in the "operations with arguments") and updates the boolean flag. +Combination of those operations are just function composition: + +```ocaml +let (|>) (m : Cursor.t) (f : Cursor.t -> 'b) = f m +let display_lines lcd l1 l2 = + let col_shift = 3 in + clear lcd; + let cur = Cursor.of_lcd lcd in + let open Cursor in + cur + |> set_visible false + |> set_blink false + |> set_position col_shift 0 + |> write_string l1 ~wrap:false + |> set_position col_shift 1 + |> write_string l2 ~wrap:false +``` + +## Concluding + +You can find more usage examples in the [lcd_lwt.ml file](https://github.com/co-dan/ocaml-wiringpi/blob/lcd_lwt/examples/lcd_lwt.ml). +Overall, I thought that OCaml was a good fit for this kind of programming, and the type system helps out a bit! diff --git a/content/alg-theories-examples.md b/content/alg-theories-examples.md new file mode 100644 index 0000000..d00e252 --- /dev/null +++ b/content/alg-theories-examples.md @@ -0,0 +1,76 @@ +title: Counterexamples of algebraic theories +tags: category theory, algebra +date: 2020-01-21 00:00 +--- + +An *algebraic theory* $T$ is given by a collection of operations with +given arities, and a number of equations of the form $\Gamma \mid t = +s$ between terms of $T$. Here $t$ and $s$ can contain free variables +from $\Gamma$. A model $M$ for an algebraic theory $T$ is an +underlying set (call it $M$ as well) which interprets all the +operations, and for which all the equations of $T$ hold. A morphism of +$T$-models is a function of the underlying sets that presevres the +interpretation of all the operations. + +The category $Mod_T$ of $T$-models comes with a forgetful functor $U : +Mod_T \to Set$. It has a left adjoint $F : Set \to Mod_T$ which +constructs a free $T$-model on the underlying set. + +The category $Mod_T$ and the functor $U$ always satisfies a number of +properties that can be used for showing that certain theories are not +algebraic, i.e. that certain categories do not arise as models of +algebraic theories. + +## Conservativity of the forgetful functor + +One of the properties of $U$ is that it reflects isomorphisms: if $f : +M \to M'$ is a morphism of $T$-models, and $f$ has a set-theoretic +inverse $g$ (i.e. $g : M' \to M$ such that $U(f) \circ g = 1$ and $g +\circ U(f) = 1$), then $f$ is an isomorphism of models. + +This can be easily observed for monoids: let $f$ be a group +homomorphism and let $g$ be it's set theoretic inverse. Then $g(ab) = +g(f(g(a)) f(g(b))) = g (f (g(a)g(b))) = g(a)g(b)$. + +**Posets**. A discrete two-element poset $X = \\{0,1\\}$ can be embedded +into the poset $Y = \\{0, 1\\}$ with $0 \leq_Y 1$. Moreover, on a +set-theoretic level, this injection is just an identity function and +hence is an isomorphism. If the theory of posets were algebraic, then +$X$ and $Y$ were isomorphic as posets, which is not the case. + +## Lattice of submodels + +Up to isomorphism, a subobject of a $T$-model $M$ is a $T$-model $M'$ +such that $M' \subseteq M$ and the interpretation of all the +operations on $M'$ coincides with the interpretations in $M$. + +Like in many categories, subobjects for a given model form a lattice. +The meet of submodels is just a set-theoretic intersection (can be +computed from the pullback of one submodel along the other). However, +the join of two submodels may not coincide with the set-theoretic +union: if we have two submodels $M_1, M_2$, and a binary operation $f$ +in the theory, then the set-theoretic union $M_1 \cup M_2$ will not +contain $\bar{f}(x, y)$ for $x \in M_1, y \in M_2$ and $\bar{f}$ the +interpretation of $f$ in $M$. + +Given a set $X \subseteq M$ we define the *closure of $X$* to be the +smallest submodel $\bar{X} \subseteq M$ that contains $X$. Explicitly, +$$ +\bar{X} = \\{ \bar{f}(x_1, \dots, x_n) \in M \mid x_1, \dots, x_n \in X, f \in T \\} +$$ +where $\bar{f}$ is a $M$-interpretation of an $n$-ary operation $f$ from $T$. +$\bar{X}$ behaves like the closure operation and satisfies all the expected properties. + +Using the closure operation we can define the join of two submodels $M_1, M_2 \subseteq M$ as +$$M_1 \vee M_2 = \overline{M_1 \cup M_2}$$. +This can be generalized to unions over arbitrary families $\bigvee_{i \in \mathcal{I}} M_i$. + +Finally, note the following: if $M_i, i \in \mathcal{I}$ is a directed family, then $\bigvee_{i \in \mathcal{I}} M_i = \bigcup_{i \in \mathcal{I}} M_i$. +If at some point we add $\bar{f}(a, b)$ to the join, with $a \in M_a, b \in M_b$, then there is some $M_c$ that includes both $M_a$ and $M_b$; this submodel $M_c$ already contains $\bar{f}(a, b)$. + +Now we are ready to give another counterexample. + +**Complete lattices**. The theory of complete lattices in not algebraic. +Consider the completion $\omega+1$ of $\omega$ with the top element $\infty$. +Consider a directed family $D = \\{ \\{ 0, \dots, n \\} \mid n \in \mathbb{N} \\}$. +The set-theoretic union $\bigcup D$ is just the natural numbers (not a complete lattice), but the join $\bigvee D$ is $\omega+1$. diff --git a/content/boolean-cat.md b/content/boolean-cat.md new file mode 100644 index 0000000..3421dca --- /dev/null +++ b/content/boolean-cat.md @@ -0,0 +1,67 @@ +title: Uniquness of the absurdity proofs in Cartesian closed categories +tags: category theory +date: 2015-08-12 00:00 +--- + +Adding the DNE to CCCs collapses them into Boolean algebras: +Suppose $\mathcal{C}$ is a Cartesian closed category. +We can show that if $\mathcal{C}$ has a family of isomorphisms between $A$ and $(A \to 0) \to 0$, then $\mathcal{C}$ is a poset (and a Boolean algebra). + +**Proposition**. +For any object $A$ of $\mathcal{C}$, $A \times 0 \simeq 0$. + +*Proof*. + ${\mathit{\mathop{Hom}}}(A \times 0, B) \simeq {\mathit{\mathop{Hom}}}(0 \times A, B) \simeq {\mathit{\mathop{Hom}}}(0, B^A) + \simeq 1$. + Therefore, $A \times 0$ is an initial object in $\mathcal{C}$. + Clearly $(\pi_2, \bot_{A \times 0})$ establishes the isomorphism. + +*Qed* + +**Proposition**. + Given a morphism $f : A \to 0$, we can conclude that $A \simeq 0$. + +*Proof*. + The isomorphism is witnessed by $(f, \bot_A) : A \to A \times 0 \simeq 0$ (where + $\bot_A : 0 \to A$ is the unique initial morphism). + +*Qed* + +We have seen that given the morphism $f : A \to 0$ we can construct an +iso $A \simeq 0$. In particular, that means that there can be *at + most one arrow* between $A$ and $0$; interpreting arrows as +"proofs" or "proof object", that means that in the intuitionistic +calculus there is at most one proof/realizer of the negation (up to +isomorphism). + +Now, returning to the original question, how do CCCs with +double-negation elimination look like? Specifically, we want to show +that if in $\mathcal{C}$ there is an isomorphism between $A$ and +$0^{(0^A)}$ for all objects $A$, then the categorical structure +collapses into a Boolean algebra. + +We say that a category has *double negation elimination* if for every $A$, it is the case that $A \simeq 0^{(0^A)}$. + +**Proposition**. + Given a category with double negation elimination and two morphisms $f, g : A \to B$, it + is the case that $f = g$. + +*Proof*. + Let $i$ be the isomorphism between $B$ and $0^{(0^B)}$. Then, + $i \circ f, i \circ g \in {\mathit{\mathop{Hom}}}(A, 0^{(0^B)}) \simeq {\mathit{\mathop{Hom}}}(A \times + (0^B), 0)$. + But the latter set can contain at most one element; thus, + $i \circ f = i \circ g$. Because $i$ is an isomorphism, $f = g$. + +*Qed* + +It might be worth noting that in "standard" cartesian closed +categories, a morphism $i : B \to 0^{(0^B)}$ serve as a coequalizer of +$f$ and $g$. Perhaps stretching the analogy a bit, we can view $i$ as +"equalizes" two constructive proofs of $B$ into a single classical +proof. + + +**See also** + +- [A categorical characterization of Boolean algebras](http://link.springer.com/article/10.1007%2FBF02485724) by M.E. Szabo diff --git a/content/delim.md b/content/delim.md new file mode 100644 index 0000000..0daca88 --- /dev/null +++ b/content/delim.md @@ -0,0 +1,263 @@ +title: Delimited continuations +date: 2018-12--2 00:00 +tags: programming languages, scheme +--- + +A continuation can be understood as a context that represents the +“rest of the program”. Notice that formulated in this way, +continuation is a meta-level abstraction. Ability to manipulate such +continuations as functions is obtained using *control operators*. The +Scheme programming language has traditionally been a ripe field for +research on control operators such as `(call/cc)`. + +Call-with-current-continuation is an example of an unbounded control +operator. A different and a more general class of control operators +are *delimited control operators*. The basic idea is that rather to +capture the whole rest of the program as a continuation, delimited +control operators capture delimited continuations, i.e. some part of +the program with a hole. + +In order to run the examples from this document in [DrRacket](http://racket-lang.org) make sure to put +`(require racket/control)` in your file. + + +## Prompt/control + +One of the first (and simplest) delimited control operations is +prompt/control. `prompt` delimits the continuation, and `control` +captures the current continuation (up to the innermost `prompt`). + + (prompt (+ 1 (control k (k 3)))) + +The code is evaluated as follows: + + (prompt (+ 1 (control k (k 3)))) + => (prompt ((λ (k) (k 3)) (λ (x) (+ 1 x)))) + => (prompt ((λ (x) (+ 1 x)) 3)) + => (prompt (+ 1 3)) => (prompt 4) => 4 + +Here `(λ (x) (+ 1 x))` represents the delimited continuation (the +program inside `prompt` without the `control` part) on an object +level. When `(control k body)` is evaluated inside a `prompt`, it gets +replaced with `(λ (k) body)` (capturing `k` inside `body`) and applied +to the continuation. In terms of reduction semantics one would have +the [following reduction rules](https://docs.racket-lang.org/reference/cont.html#%28form._%28%28lib._racket%2Fcontrol..rkt%29._prompt%29%29): + + (prompt v) -> v + (prompt E[(control k body)]) -> (prompt ((λ (k) body) (λ (x) E[x]))) + +where the evaluation context E does not contain `prompt` and `x` is a +free variable in E. + +In particular, if continuation is not invoked in `body`, it is just +discarded, as in the following example: + + (prompt (+ 1 (control k 3))) + +A captured delimited continuation can be invoked multiple times: + + (prompt + (+ 1 (control k (let ([x (k 1)] [y (k 2)]) + (k (* x y)))))) + +In this case the continuation `k = (λ (x) (+ 1 x))`, so when it is +invoked in the body of `control`, x and y get set to 2 and 3, +respectively. Hence, the final result of that program is 1+2\*3=7. + +Nested continuations + + (prompt + (+ 2 (control k (+ 1 (control k1 (k1 6)))))) + => (prompt + ((λ (x) (+ 2 (control k (+ 1 x)))) 6)) + => (prompt + (+ 2 (control k (+ 1 6)))) + => (prompt + (+ 2 (control k 7))) + => (prompt 7) => 7 + + +## While loops with breaks + +We can utilize the prompt/control operators to implement a simple +macro for a while loop with an ability to break out of it, sort of +similar to the while/break statements in C. + + (define-syntax-rule (break) (control k '())) + (define-syntax-rule (while cond body ...) + (prompt + (let loop () + (when cond + body ... + (loop))))) + +We define the `(while)` construct as a simple recursive loop with just +one caveat – we wrap the whole thing in a `prompt`. Then `(break)`, +when evaluated, just discards the whole continuation with is bound to +the `(while)` loop. We can test this macro by writing a procedure that +multiplies all the elements in the list. + + (define (multl lst) + (define i 0) + (define fin (length lst)) + (define res 1) + (while (< i fin) + (let ([val (list-ref lst i)]) + (printf "The value of lst[~a] is ~a\n" i val) + (set! res (* res val)) + (when (= val 0) + (begin (set! res 0) (break)))) + (set! i (+ i 1))) + res) + +By running this program we can observe that it finishes early whenever +it encounters a zero in the list: + + > (multl '(1 2 3)) + The value of lst[0] is 1 + The value of lst[1] is 2 + The value of lst[2] is 3 + 6 + > (multl '(1 2 0 3 4 5)) + The value of lst[0] is 1 + The value of lst[1] is 2 + The value of lst[2] is 0 + 0 + > + + +*The situation with `break` in C is slightly different; as it has +been pointed out to me, break is a *statement*, whereas in our toy +example `break` is an expression. Due to the dichotomy of statements +and expressions in C this example is not very faithful to C semantics.* + +### Prompt tags + +The `(while)` macro that we have is actually buggy. The problem arises +when the body of the while loop contains an additional prompt +delimiter: + + (while (< i 3) (writeln "Hi") (prompt (break)) (set! i (+ i 1))) + +When running this example “Hi” is written to the standard output three +time. Oops. To circumvent this problem we can use the [tagged +prompt-at/control-at operators](https://docs.racket-lang.org/reference/cont.html#%28form._%28%28lib._racket%2Fcontrol..rkt%29._prompt-at%29%29) with the following reduction semantics: + + (prompt-at tag v) -> v + (prompt-at tag E[(control-at tag' k body)]) -> (prompt-at tag ((λ (k) body) (λ (x) E[x]))) + +where `tag = tag'` and `E` does not contain `prompt-at tag`. So the +only difference between prompt-at/control-at and prompt/control is the +presence of [prompt tags](https://docs.racket-lang.org/reference/cont.html#%28def._%28%28quote._~23~25kernel%29._continuation-prompt-tag~3f%29%29) which allows for a more fine-grained matching +between delimiters and control operators. + +We can sort of fix our while macro by creating a special tag + + (define while-tag (make-continuation-prompt-tag 'tagje)) + (define-syntax-rule (break) (control-at while-tag k '())) + (define-syntax-rule (while cond body ...) + (prompt-at while-tag + (let loop () + (when cond + body ... + (loop))))) + +The following program then prints “Hi” only once. + + (define i 0) + (while (< i 3) (writeln "Hi") (prompt (break)) (set! i (+ i 1))) + + +## Reset/shift + +The other pair of delimited control operators are `shift` and `reset`. + +[The reductions](https://docs.racket-lang.org/reference/cont.html#%28form._%28%28lib._racket%2Fcontrol..rkt%29._reset%29%29) for `reset/shift` are as follows. + + (reset v) -> v + (reset E[(shift k body)]) -> (reset ((λ (k) body) (λ (x) (reset E[x])))) + +where the evaluation context E does not contain `reset` and `x` is a +free variable in E. Contrast this with the reduction rules for +`prompt/control` + + (prompt v) -> v + (prompt E[(control k body)]) -> (prompt ((λ (k) body) (λ (x) E[x]))) + +As you can notice, the difference between the `prompt/control` +reductions is that in the case when the continuation is captured, it +is wrapped in an additional `reset`. Thus, any invocation of a bound +delimited continuation `k` cannot escape to the outer scope. + +To observe the practical difference, consider the following example. + + (define (skip) '()) + (define (bye) (println "Capturing and discarding the continuation...") 42) + (prompt + (let ([act (control k (begin + (k skip) + (k (λ () (control _ (bye)))) + (k skip)))]) + (act) + (println "Doing stuff"))) + +If we were to remove the second line `(k (λ () (control _ (bye))))` in +the begin block, then this program would print “Doing stuff” twice (as +invoking `(k skip)` binds the dummy function `skip` to `act` and +executes `(begin (act) (println "Doing stuff"))`). With such a line +present, during the invocation of `k`, `act` gets bound to `(λ () +(control _ (bye)))`. Therefore, when `act` is evaluated the +continuation is of the form `(prompt E[(control _ bye)])`, which just +reduces to `(prompt (bye))` and `(prompt 42)`. So, the output of the +program above is + + "Doing stuff" + "Capturing and discarding the continuation..." + 42 + +If we replace `prompt` with `reset` and `control` with `shift`, as in +the code snippet below, then every invocation of `k` wraps the +continuation in another `reset`. + + (reset + (let ([act (shift k (begin + (k skip) + (k (λ () (shift _ (bye)))) + (k skip)))]) + (act) + (println "Doing stuff"))) + +After some reductions, the terms is + + (reset + ((λ (k) (begin (k skip) (k (λ () (shift _ (bye)))) (K skip))) + (λ (x) (reset (begin (x) (println "Doing stuff")))))) + +As you can see, the second invocation of `k` results in `(reset (begin +(shift _ (bye)) (println "Doing stuff")))`, and thus + +1. “Doing stuff” does not get printed (as this part of the term gets discarded. +2. The shift operator discards the *inner* continuation, not the outer + continuation. In particular, that means that the call to `(k (λ () + (shift _ (bye))))` returns! + +The output of the snippet is thus + + "Doing stuff" + "Capturing and discarding the continuation..." + "Doing stuff" + + +## Comments and further reading + +See references at . + +Some interesting papers: + +- [Shift to control](http://homes.soic.indiana.edu/ccshan/recur/recur.pdf) by C.C. Shan shows how to express shift/reset and control/prompt in terms of each other. +- [On the Dynamic Extent of Delimited Continuations](http://www.brics.dk/RS/05/2/BRICS-RS-05-2.pdf) by Dariusz + Biernacki and Olivier Danvy present the difference between shift and + control using breadth-first traversal as an example. It is also + explained in which sense `shift` is a static delimited control + operator and `control` is a dynamic delimited control operator. +- A bibliography of [Continuations and Continuation Passing Style](http://library.readscheme.org/page6.html) at the readscheme.org diff --git a/content/ghc-api.md b/content/ghc-api.md new file mode 100644 index 0000000..3691faa --- /dev/null +++ b/content/ghc-api.md @@ -0,0 +1,460 @@ +title: GHC API tutorial +date: 2014-06-18 00:00 +tags: haskell +--- + +# Intro + +*Disclamer: these notes have been written a couple of years ago. While + some of the basic facts are still the case, some details may be + subject to change. Trust, but verify!* + +It’s hard to get into writing code that uses GHC API. The API itself is and the number of various functions and options significantly outnumber the amount of tutorials around. + +In this note I will try to elaborate on some of the peculiar, interesting problems I’ve encountered during my experience writing code that uses GHC API and also provide various tips I find useful. + +Many of the points I make in this note are actually trivial, but nevertheless I made all of the mistakes mentioned in here myself, perhaps due to my naive approach of quickly diving in and experimenting, instead of reading into the documentation and source code. + +This note is an adaptation of a [series of my blog posts](http://parenz.wordpress.com/tag/ghc/) on the subject. + +In order to run examples presented in this note you'll have to + +1. Install the `ghc-paths` library +2. Compile the code with `ghc -package ghc file.hs` OR +3. Expose the `ghc-7.8.x` library: `ghc-pkg expose ghc-7.8.2` + +The up to date Haddocks for GHC are located [here](http://www.haskell.org/ghc/docs/latest/html/libraries/ghc/index.html). + +# Interpreted, compiled, and package modules + +There are different ways of bringing contents of Haskell modules into scope, a process that is necessary for evaluating/interpreting bits of code on-the-fly. We will walk through some of the caveats of this process. + +## Interpreted modules + +Imagine the following situation: we have a Haskell source file with code we want to load dynamically and evaluate. That is a basic task in the GHC API terms, but there are actually different ways of doing that. + +Let us have a file 'test.hs' containing the code we want to access: + +```haskell +module Test (test) where +test :: Int +test = 123 +``` + +The basic way to get the 'test' data would be to load `Test` as an interpreted module: + +```haskell +import Control.Applicative +import DynFlags +import GHC +import GHC.Paths +import MonadUtils (liftIO) +import Unsafe.Coerce + +main = defaultErrorHandler defaultFatalMessager defaultFlushOut $ + runGhc (Just libdir) $ do + -- we have to call 'setSessionDynFlags' before doing + -- everything else + dflags <- getSessionDynFlags + -- If we want to make GHC interpret our code on the fly, we + -- ought to set those two flags, otherwise we + -- wouldn't be able to use 'setContext' below + setSessionDynFlags $ dflags { hscTarget = HscInterpreted + , ghcLink = LinkInMemory + } + setTargets =<< sequence [guessTarget "test.hs" Nothing] + load LoadAllTargets + -- Bringing the module into the context + setContext [IIModule $ mkModuleName "Test"] + -- evaluating and running an action + act <- unsafeCoerce <$> compileExpr "print test" + liftIO act +``` + +The reason that we have to use `HscInterpreted` and `LinkInMemory` is that otherwise it would compile test.hs in the current directory and leave test.hi and test.o files, which we would not be able to load in the interpreted mode. However, `setContext`, will try to bring the code in those files first, when looking for the module `Test`. + + $ ghc -package ghc --make Interp.hs + [1 of 1] Compiling Main ( Interp.hs, Interp.o ) + Linking Interp ... + $ ./Interp + 123 + +Yay, it works! But let's try something fancier like printing a list of integers, one-by-one. For that, we will use an awesome `forM_` function. + + --- the rest of the file is the same + --- ... + -- evaluating and running an action + act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print" + liftIO act + + +When we try to run it: + + $ ./Interp + Interp: panic! (the 'impossible' happened) + (GHC version 7.8.2 for x86_64-apple-darwin): + Not in scope: ‘forM_’ + +Well, fair enough, where can we expect GHC to get the `forM_` from? We have to bring `Control.Monad` into the scope in order to do that. + +This brings us to the next section + +## Package modules + +Naively, we might want to load `Control.Monad` in a similar fashion as we did with loading `test.hs`. + +```haskell +main = defaultErrorHandler defaultFatalMessager defaultFlushOut $ + runGhc (Just libdir) $ do + dflags <- getSessionDynFlags + setSessionDynFlags $ dflags { hscTarget = HscInterpreted + , ghcLink = LinkInMemory + } + setTargets =<< sequence [ guessTarget "test.hs" Nothing + , guessTarget "Control.Monad" Nothing] + load LoadAllTargets + -- Bringing the module into the context + setContext [IIModule $ mkModuleName "Test"] + + -- evaluating and running an action + act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print" + liftIO act +``` + +Unfortunately, this attempt fails: + + Interp: panic! (the 'impossible' happened) + (GHC version 7.8.2 for x86_64-apple-darwin): + module ‘Control.Monad’ is a package module + +Huh, what? I thought `guessTarget` works on all kinds of modules. + +Well, it does. But it doesn't "load the module", it merely sets it as the *target for compilation*. Basically, it (together with `load LoadAllTargets`) is equivalent to calling `ghc --make`. And surely it doesn't make much sense trying to `ghc --make Control.Monad` when `Control.Monad` is a module from the base package. What we need to do instead is to bring the compiled `Control.Monad` module into scope. Luckily it's not very hard to do with the help of the `simpleImportDecl :: ModuleName -> ImportDecl name` function: + +```haskell +main = defaultErrorHandler defaultFatalMessager defaultFlushOut $ + runGhc (Just libdir) $ do + -- we have to call 'setSessionDynFlags' before doing + -- everything else + dflags <- getSessionDynFlags + -- If we want to make GHC interpret our code on the fly, we + -- ought to set those two flags, otherwise we + -- wouldn't be able to use 'setContext' below + setSessionDynFlags $ dflags { hscTarget = HscInterpreted + , ghcLink = LinkInMemory + } + setTargets =<< sequence [ guessTarget "test.hs" Nothing ] + load LoadAllTargets + -- Bringing the module into the context + setContext [ IIModule $ mkModuleName "Test" + , IIDecl . simpleImportDecl + . mkModuleName $ "Control.Monad" ] + -- evaluating and running an action + act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print" + liftIO act +``` + +And this works like a charm: + + $ ./Interp + 1 + 2 + 123 + + +## Compiled modules + +What we have implemented so far corresponds to the `:load* test.hs` +command in GHCi, which gives us the full access to the source code of +the program. To illustrate this let's modify our test file: + +```haskell +module Test (test) where + +test :: Int +test = 123 + +test2 :: String +test2 = "Hi" +``` + +Now, if we want to load that file as an interpreted module and evaluate +`test2` nothing will stop us from doing so. + + $ ./Interp2 + (123,"Hi") + + +If we want to use the compiled module (like `:load test.hs` in GHCi), +we have to bring `Test` into the context the same way we dealt with +`Control.Monad`: + +```haskell +main = defaultErrorHandler defaultFatalMessager defaultFlushOut $ + runGhc (Just libdir) $ do + dflags <- getSessionDynFlags + setSessionDynFlags $ dflags { hscTarget = HscInterpreted + , ghcLink = LinkInMemory + } + setTargets =<< sequence [ guessTarget "Test" Nothing ] + load LoadAllTargets + -- Bringing the module into the context + setContext [ IIDecl $ simpleImportDecl (mkModuleName "Test") + , IIDecl $ simpleImportDecl (mkModuleName "Prelude") + ] + printExpr "test" + printExpr "test2" + + +printExpr :: String -> Ghc () +printExpr expr = do + liftIO $ putStrLn ("-- Going to print " ++ expr) + act <- unsafeCoerce <$> compileExpr ("print (" ++ expr ++ ")") + liftIO act +``` + +The output: + + + $ ./Interp2 + -- Going to print test + 123 + -- Going to print test2 + target: panic! (the 'impossible' happened) + (GHC version 7.6.3 for x86_64-apple-darwin): + Not in scope: `test2' + Perhaps you meant `test' (imported from Test) + + Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug + +# Error handling + +Our exercises above produced a number of GHC panics/erros. By default +you can expect GHC to spew all the errors onto your screen, but for +different purposes you might want to, e.g. log them. + +Naturally at first I tried the exception handling mechanism: + +```haskell +-- Main.hs: +import GHC +import GHC.Paths +import MonadUtils +import Exception +import Panic +import Unsafe.Coerce +import System.IO.Unsafe + +-- I thought this code would handle the exception +handleException :: (ExceptionMonad m, MonadIO m) + => m a -> m (Either String a) +handleException m = + ghandle (\(ex :: SomeException) -> return (Left (show ex))) $ + handleGhcException (\ge -> return (Left (showGhcException ge ""))) $ + flip gfinally (liftIO restoreHandlers) $ + m >>= return . Right + +-- Initializations, needed if you want to compile code on the fly +initGhc :: Ghc () +initGhc = do + dfs <- getSessionDynFlags + setSessionDynFlags $ dfs { hscTarget = HscInterpreted + , ghcLink = LinkInMemory } + return () + + +-- main entry point +main = test >>= print + +test :: IO (Either String Int) +test = handleException $ runGhc (Just libdir) $ do + initGhc + setTargets =<< sequence [ guessTarget "file1.hs" Nothing ] + graph <- depanal [] False + loaded <- load LoadAllTargets + -- when (failed loaded) $ throw LoadingException + setContext (map (IIModule . moduleName . ms_mod) graph) + let expr = "run" + res <- unsafePerformIO . unsafeCoerce <$> compileExpr expr + return res + + +-- file1.hs: +module Main where + +main = return () + +run :: IO Int +run = do + n <- x + return (n+1) +``` + +The problem is when I run the 'test' function above I receive the +following output: + + h> test + + test/file1.hs:4:10: Not in scope: `x' + + Left "Cannot add module Main to context: not a home module" + it :: Either String Int + + +It appears that the exception handler did cat *an* error, but a +peculiar one, and not that one I wanted to catch. + +## Solution + +We've studied the GHC API together with Luite Stegeman and I think +we've found a more or less satisfactory solution. + +Errors are handled using the +[LogAction](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:LogAction) +specified in the +[DynFlags](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:DynFlags) +for your GHC session. So to fix this you need to change 'log_action' +parameter in dynFlags. For example, you can do this: + +```haskell +initGhc = do + .. + ref <- liftIO $ newIORef "" + dfs <- getSessionDynFlags + setSessionDynFlags $ dfs { hscTarget = HscInterpreted + , ghcLink = LinkInMemory + , log_action = logHandler ref -- ^ this + } + +-- LogAction == DynFlags -> Severity -> SrcSpan -> PprStyle -> MsgDoc -> IO () +logHandler :: IORef String -> LogAction +logHandler ref dflags severity srcSpan style msg = + case severity of + SevError -> modifyIORef' ref (++ printDoc) + SevFatal -> modifyIORef' ref (++ printDoc) + _ -> return () -- ignore the rest + where cntx = initSDocContext dflags style + locMsg = mkLocMessage severity srcSpan msg + printDoc = show (runSDoc locMsg cntx) +``` + +# Package databases + +A *package database* is a directory where the information about your +installed packages is stored. For each package registered in the +database there is a .conf file with the package details. The .conf +file contains the package description (just like in the .cabal file) +as well as path to binaries and a list of resolved dependencies: + + $ cat aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a.conf + name: aeson + version: 0.6.1.0.1 + id: aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a + <..snip..> + import-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722 + library-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722 + <..snip..> + depends: attoparsec-0.10.4.0-acffb7126aca47a107cf7722d75f1f5e + base-4.7.0.0-b67b4d8660168c197a2f385a9347434d + blaze-builder-0.3.1.1-9fd49ac1608ca25e284a8ac6908d5148 + bytestring-0.10.3.0-66e3f5813c3dc8ef9647156d1743f0ef + <..snip..> + +You can use `ghc-pkg` to manage installed packages on your system. For +example, to list all the packages you've installed run `ghc-pkg list`. +To list all the package databases that are automatically picked up by +`ghc-pkg` do the following: + + $ ghc-pkg nonexistentpkg + /home/dan/ghc/lib/ghc-7.7.20130722/package.conf.d + /home/dan/.ghc/i386-linux-7.7.20130722/package.conf.d + +See `ghc-pkg --help` or the [online documentation](http://www.haskell.org/ghc/docs/latest/html/users_guide/packages.html#package-management) for more details. + +## Adding a package db + +By default GHC knows only about two package databases: the global +package database (usually `/usr/lib/ghc-something/` on Linux) and the +user-specific database (usually `~/.ghc/lib`). In order to pick up a +package that resides in a different package database you have to +employ some tricks. + +For some reason GHC API does not export an clear and easy-to-use +function that would allow you to do that, although the code we need is +present in the GHC sources. + +The way this whole thing works is the following: + +1. GHC calls [initPackages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:initPackages), +which reads the database files and sets up the [internal +table](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#t:PackageState) +of package information + +2. The reading of package databases is performed via the +[readPackageConfigs](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:readPackageConfigs) +function. It reads the user package database, the global package +database, the "GHC_PACKAGE_PATH" environment variable, and *applies +the extraPkgConfs function*, which is a dynflag and has the following +type: `extraPkgConfs :: [PkgConfRef] -> [PkgConfRef]` +([PkgConfRef](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:PkgConfRef) +is a type representing the package database). The `extraPkgConf` flag +is supposed to represent the `-package-db` command line option. + +3. Once the database is parsed, the loaded packages are stored in the +`pkgDatabase` dynflag which is a list of [PackageConfig](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/PackageConfig.html#t:PackageConfig)s + + +So, in order to add a package database to the current session we have +to simply modify the `extraPkgConfs` dynflag. Actually, there is +already a function [present](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/src/DynFlags.html#line-3656) in the GHC source that does exactly what we +need: `addPkgConfRef :: PkgConfRef -> DynP ()`. Unfortunately it's not +exported so we can't use it in our own code. I rolled my own functions +that I am using in the interactive-diagrams project, feel free to copy +them: + +```haskell +-- | Add a package database to the Ghc monad +#if __GLASGOW_HASKELL_ >= 707 +addPkgDb :: GhcMonad m => FilePath -> m () +#else +addPkgDb :: (MonadIO m, GhcMonad m) => FilePath -> m () +#endif +addPkgDb fp = do + dfs <- getSessionDynFlags + let pkg = PkgConfFile fp + let dfs' = dfs { extraPkgConfs = (pkg:) . extraPkgConfs dfs } + setSessionDynFlags dfs' +#if __GLASGOW_HASKELL_ >= 707 + _ <- initPackages dfs' +#else + _ <- liftIO $ initPackages dfs' +#endif + return () + +-- | Add a list of package databases to the Ghc monad +-- This should be equivalen to +-- > addPkgDbs ls = mapM_ addPkgDb ls +-- but it is actaully faster, because it does the package +-- reintialization after adding all the databases +#if __GLASGOW_HASKELL_ >= 707 +addPkgDbs :: GhcMonad m => [FilePath] -> m () +#else +addPkgDbs :: (MonadIO m, GhcMonad m) => [FilePath] -> m () +#endif +addPkgDbs fps = do + dfs <- getSessionDynFlags + let pkgs = map PkgConfFile fps + let dfs' = dfs { extraPkgConfs = (pkgs ++) . extraPkgConfs dfs } + setSessionDynFlags dfs' +#if __GLASGOW_HASKELL_ >= 707 + _ <- initPackages dfs' +#else + _ <- liftIO $ initPackages dfs' +#endif + return () +``` + +## Links + +- [Packages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html) module, + contains other functions that modify/make use of `extraPkgConfs` diff --git a/content/omitting-types.md b/content/omitting-types.md new file mode 100644 index 0000000..5e3f420 --- /dev/null +++ b/content/omitting-types.md @@ -0,0 +1,50 @@ +title: Omitting types +date: 2016-03-15 00:00 +tags: logic +--- + +**Definition.** A *type* $p(x)$ for the theory $T$ is a collection of + formulas with a free variable $x$ consitent with $T$. + +-- see Michael Weiss' [explanation of types](https://diagonalargument.com/2019/05/06/non-standard-models-of-arithmetic-1/). + +**Definition.** A type $p(x)$ is *principal* if there is a + formula $\alpha(x)$ such that $p(x)=\\{ \psi(x) \mid \alpha \vdash \psi \\}$ + +**Definition.** A type $p(x)$ is *full* if for every +formula $\psi(x)$ either $\psi \in p(x)$ or $\neg \psi \in p(x)$. + +A full type can be seen as a way of describing an a generic "element" +of a theory. For any model $M$ and an element $m \in M$ we can +consider the full type $p(m)$ generated by $m$. It contains all the formlas +$\psi(m)$ that are true about $m$. + +[Omitting types theorem](https://en.wikipedia.org/wiki/Type_(model_theory)#The_omitting_types_theorem) +says that given a _complete_ countable theory $T$ in a countable +language, there is a model $M$ of $T$ which omits countably-many +non-principal full types. + +Why is the non-principal condition needed? + +If a theory $T$ is complete, then any model of $T$ realizes all the +full principal types. For if $p(x)$ is a complete principal isolated +by $\psi(x)$, then $T \not\vdash \not \exists x.\psi(x)$, for otherwise the +type $p(x)$ would be inconsistent with $T$. Because $T$ is complete it +must be the case that $T \vdash \exists x. \psi(x)$. Then such $x$ +realizes $p(x)$. + +What happens if the theory $T$ is not complete? + +Then it is not the case that all +models of $T$ realize all the full principle types. For instance take +Peano arithmetic, and take $T_1 = PA + Con(PA)$ and $T_2 = PA + \neg +Con(PA)$. Both $T_1$ and $T_2$ are consistent, thus they have models +$M_1$ and $M_2$. Then denote by $q(x)$ the type generated by the +element $0$ in $M_1$, and by $p(x)$ the type generated by the element +$0$ in $M_2$. Both of those types are isolated by the formula $\phi(x) +:= \neg \exists y. y < x$, because $\phi$ completely determines $0$. +Albeit those types are types for different theories, they are both +types for a common theory $PA$. Furthemore, those types are complete +and isolated. However, one of the types states the consistency of $PA$ +and the other one points out the inconsistency of $PA$. Thus both $q$ +and $p$ cannot be realized in the same model.