461 lines
17 KiB
Markdown
461 lines
17 KiB
Markdown
title: GHC API tutorial
|
||
date: 2014-06-18 00:00
|
||
tags: haskell
|
||
---
|
||
|
||
# Intro
|
||
|
||
*Disclamer: these notes have been written a couple of years ago. While
|
||
some of the basic facts are still the case, some details may be
|
||
subject to change. Trust, but verify!*
|
||
|
||
It’s hard to get into writing code that uses GHC API. The API itself is and the number of various functions and options significantly outnumber the amount of tutorials around.
|
||
|
||
In this note I will try to elaborate on some of the peculiar, interesting problems I’ve encountered during my experience writing code that uses GHC API and also provide various tips I find useful.
|
||
|
||
Many of the points I make in this note are actually trivial, but nevertheless I made all of the mistakes mentioned in here myself, perhaps due to my naive approach of quickly diving in and experimenting, instead of reading into the documentation and source code.
|
||
|
||
This note is an adaptation of a [series of my blog posts](http://parenz.wordpress.com/tag/ghc/) on the subject.
|
||
|
||
In order to run examples presented in this note you'll have to
|
||
|
||
1. Install the `ghc-paths` library
|
||
2. Compile the code with `ghc -package ghc file.hs` OR
|
||
3. Expose the `ghc-7.8.x` library: `ghc-pkg expose ghc-7.8.2`
|
||
|
||
The up to date Haddocks for GHC are located [here](http://www.haskell.org/ghc/docs/latest/html/libraries/ghc/index.html).
|
||
|
||
# Interpreted, compiled, and package modules
|
||
|
||
There are different ways of bringing contents of Haskell modules into scope, a process that is necessary for evaluating/interpreting bits of code on-the-fly. We will walk through some of the caveats of this process.
|
||
|
||
## Interpreted modules
|
||
|
||
Imagine the following situation: we have a Haskell source file with code we want to load dynamically and evaluate. That is a basic task in the GHC API terms, but there are actually different ways of doing that.
|
||
|
||
Let us have a file 'test.hs' containing the code we want to access:
|
||
|
||
```haskell
|
||
module Test (test) where
|
||
test :: Int
|
||
test = 123
|
||
```
|
||
|
||
The basic way to get the 'test' data would be to load `Test` as an interpreted module:
|
||
|
||
```haskell
|
||
import Control.Applicative
|
||
import DynFlags
|
||
import GHC
|
||
import GHC.Paths
|
||
import MonadUtils (liftIO)
|
||
import Unsafe.Coerce
|
||
|
||
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
|
||
runGhc (Just libdir) $ do
|
||
-- we have to call 'setSessionDynFlags' before doing
|
||
-- everything else
|
||
dflags <- getSessionDynFlags
|
||
-- If we want to make GHC interpret our code on the fly, we
|
||
-- ought to set those two flags, otherwise we
|
||
-- wouldn't be able to use 'setContext' below
|
||
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory
|
||
}
|
||
setTargets =<< sequence [guessTarget "test.hs" Nothing]
|
||
load LoadAllTargets
|
||
-- Bringing the module into the context
|
||
setContext [IIModule $ mkModuleName "Test"]
|
||
-- evaluating and running an action
|
||
act <- unsafeCoerce <$> compileExpr "print test"
|
||
liftIO act
|
||
```
|
||
|
||
The reason that we have to use `HscInterpreted` and `LinkInMemory` is that otherwise it would compile test.hs in the current directory and leave test.hi and test.o files, which we would not be able to load in the interpreted mode. However, `setContext`, will try to bring the code in those files first, when looking for the module `Test`.
|
||
|
||
$ ghc -package ghc --make Interp.hs
|
||
[1 of 1] Compiling Main ( Interp.hs, Interp.o )
|
||
Linking Interp ...
|
||
$ ./Interp
|
||
123
|
||
|
||
Yay, it works! But let's try something fancier like printing a list of integers, one-by-one. For that, we will use an awesome `forM_` function.
|
||
|
||
--- the rest of the file is the same
|
||
--- ...
|
||
-- evaluating and running an action
|
||
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
|
||
liftIO act
|
||
|
||
|
||
When we try to run it:
|
||
|
||
$ ./Interp
|
||
Interp: panic! (the 'impossible' happened)
|
||
(GHC version 7.8.2 for x86_64-apple-darwin):
|
||
Not in scope: ‘forM_’
|
||
|
||
Well, fair enough, where can we expect GHC to get the `forM_` from? We have to bring `Control.Monad` into the scope in order to do that.
|
||
|
||
This brings us to the next section
|
||
|
||
## Package modules
|
||
|
||
Naively, we might want to load `Control.Monad` in a similar fashion as we did with loading `test.hs`.
|
||
|
||
```haskell
|
||
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
|
||
runGhc (Just libdir) $ do
|
||
dflags <- getSessionDynFlags
|
||
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory
|
||
}
|
||
setTargets =<< sequence [ guessTarget "test.hs" Nothing
|
||
, guessTarget "Control.Monad" Nothing]
|
||
load LoadAllTargets
|
||
-- Bringing the module into the context
|
||
setContext [IIModule $ mkModuleName "Test"]
|
||
|
||
-- evaluating and running an action
|
||
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
|
||
liftIO act
|
||
```
|
||
|
||
Unfortunately, this attempt fails:
|
||
|
||
Interp: panic! (the 'impossible' happened)
|
||
(GHC version 7.8.2 for x86_64-apple-darwin):
|
||
module ‘Control.Monad’ is a package module
|
||
|
||
Huh, what? I thought `guessTarget` works on all kinds of modules.
|
||
|
||
Well, it does. But it doesn't "load the module", it merely sets it as the *target for compilation*. Basically, it (together with `load LoadAllTargets`) is equivalent to calling `ghc --make`. And surely it doesn't make much sense trying to `ghc --make Control.Monad` when `Control.Monad` is a module from the base package. What we need to do instead is to bring the compiled `Control.Monad` module into scope. Luckily it's not very hard to do with the help of the `simpleImportDecl :: ModuleName -> ImportDecl name` function:
|
||
|
||
```haskell
|
||
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
|
||
runGhc (Just libdir) $ do
|
||
-- we have to call 'setSessionDynFlags' before doing
|
||
-- everything else
|
||
dflags <- getSessionDynFlags
|
||
-- If we want to make GHC interpret our code on the fly, we
|
||
-- ought to set those two flags, otherwise we
|
||
-- wouldn't be able to use 'setContext' below
|
||
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory
|
||
}
|
||
setTargets =<< sequence [ guessTarget "test.hs" Nothing ]
|
||
load LoadAllTargets
|
||
-- Bringing the module into the context
|
||
setContext [ IIModule $ mkModuleName "Test"
|
||
, IIDecl . simpleImportDecl
|
||
. mkModuleName $ "Control.Monad" ]
|
||
-- evaluating and running an action
|
||
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
|
||
liftIO act
|
||
```
|
||
|
||
And this works like a charm:
|
||
|
||
$ ./Interp
|
||
1
|
||
2
|
||
123
|
||
|
||
|
||
## Compiled modules
|
||
|
||
What we have implemented so far corresponds to the `:load* test.hs`
|
||
command in GHCi, which gives us the full access to the source code of
|
||
the program. To illustrate this let's modify our test file:
|
||
|
||
```haskell
|
||
module Test (test) where
|
||
|
||
test :: Int
|
||
test = 123
|
||
|
||
test2 :: String
|
||
test2 = "Hi"
|
||
```
|
||
|
||
Now, if we want to load that file as an interpreted module and evaluate
|
||
`test2` nothing will stop us from doing so.
|
||
|
||
$ ./Interp2
|
||
(123,"Hi")
|
||
|
||
|
||
If we want to use the compiled module (like `:load test.hs` in GHCi),
|
||
we have to bring `Test` into the context the same way we dealt with
|
||
`Control.Monad`:
|
||
|
||
```haskell
|
||
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
|
||
runGhc (Just libdir) $ do
|
||
dflags <- getSessionDynFlags
|
||
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory
|
||
}
|
||
setTargets =<< sequence [ guessTarget "Test" Nothing ]
|
||
load LoadAllTargets
|
||
-- Bringing the module into the context
|
||
setContext [ IIDecl $ simpleImportDecl (mkModuleName "Test")
|
||
, IIDecl $ simpleImportDecl (mkModuleName "Prelude")
|
||
]
|
||
printExpr "test"
|
||
printExpr "test2"
|
||
|
||
|
||
printExpr :: String -> Ghc ()
|
||
printExpr expr = do
|
||
liftIO $ putStrLn ("-- Going to print " ++ expr)
|
||
act <- unsafeCoerce <$> compileExpr ("print (" ++ expr ++ ")")
|
||
liftIO act
|
||
```
|
||
|
||
The output:
|
||
|
||
|
||
$ ./Interp2
|
||
-- Going to print test
|
||
123
|
||
-- Going to print test2
|
||
target: panic! (the 'impossible' happened)
|
||
(GHC version 7.6.3 for x86_64-apple-darwin):
|
||
Not in scope: `test2'
|
||
Perhaps you meant `test' (imported from Test)
|
||
|
||
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
|
||
|
||
# Error handling
|
||
|
||
Our exercises above produced a number of GHC panics/erros. By default
|
||
you can expect GHC to spew all the errors onto your screen, but for
|
||
different purposes you might want to, e.g. log them.
|
||
|
||
Naturally at first I tried the exception handling mechanism:
|
||
|
||
```haskell
|
||
-- Main.hs:
|
||
import GHC
|
||
import GHC.Paths
|
||
import MonadUtils
|
||
import Exception
|
||
import Panic
|
||
import Unsafe.Coerce
|
||
import System.IO.Unsafe
|
||
|
||
-- I thought this code would handle the exception
|
||
handleException :: (ExceptionMonad m, MonadIO m)
|
||
=> m a -> m (Either String a)
|
||
handleException m =
|
||
ghandle (\(ex :: SomeException) -> return (Left (show ex))) $
|
||
handleGhcException (\ge -> return (Left (showGhcException ge ""))) $
|
||
flip gfinally (liftIO restoreHandlers) $
|
||
m >>= return . Right
|
||
|
||
-- Initializations, needed if you want to compile code on the fly
|
||
initGhc :: Ghc ()
|
||
initGhc = do
|
||
dfs <- getSessionDynFlags
|
||
setSessionDynFlags $ dfs { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory }
|
||
return ()
|
||
|
||
|
||
-- main entry point
|
||
main = test >>= print
|
||
|
||
test :: IO (Either String Int)
|
||
test = handleException $ runGhc (Just libdir) $ do
|
||
initGhc
|
||
setTargets =<< sequence [ guessTarget "file1.hs" Nothing ]
|
||
graph <- depanal [] False
|
||
loaded <- load LoadAllTargets
|
||
-- when (failed loaded) $ throw LoadingException
|
||
setContext (map (IIModule . moduleName . ms_mod) graph)
|
||
let expr = "run"
|
||
res <- unsafePerformIO . unsafeCoerce <$> compileExpr expr
|
||
return res
|
||
|
||
|
||
-- file1.hs:
|
||
module Main where
|
||
|
||
main = return ()
|
||
|
||
run :: IO Int
|
||
run = do
|
||
n <- x
|
||
return (n+1)
|
||
```
|
||
|
||
The problem is when I run the 'test' function above I receive the
|
||
following output:
|
||
|
||
h> test
|
||
|
||
test/file1.hs:4:10: Not in scope: `x'
|
||
|
||
Left "Cannot add module Main to context: not a home module"
|
||
it :: Either String Int
|
||
|
||
|
||
It appears that the exception handler did cat *an* error, but a
|
||
peculiar one, and not that one I wanted to catch.
|
||
|
||
## Solution
|
||
|
||
We've studied the GHC API together with Luite Stegeman and I think
|
||
we've found a more or less satisfactory solution.
|
||
|
||
Errors are handled using the
|
||
[LogAction](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:LogAction)
|
||
specified in the
|
||
[DynFlags](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:DynFlags)
|
||
for your GHC session. So to fix this you need to change 'log_action'
|
||
parameter in dynFlags. For example, you can do this:
|
||
|
||
```haskell
|
||
initGhc = do
|
||
..
|
||
ref <- liftIO $ newIORef ""
|
||
dfs <- getSessionDynFlags
|
||
setSessionDynFlags $ dfs { hscTarget = HscInterpreted
|
||
, ghcLink = LinkInMemory
|
||
, log_action = logHandler ref -- ^ this
|
||
}
|
||
|
||
-- LogAction == DynFlags -> Severity -> SrcSpan -> PprStyle -> MsgDoc -> IO ()
|
||
logHandler :: IORef String -> LogAction
|
||
logHandler ref dflags severity srcSpan style msg =
|
||
case severity of
|
||
SevError -> modifyIORef' ref (++ printDoc)
|
||
SevFatal -> modifyIORef' ref (++ printDoc)
|
||
_ -> return () -- ignore the rest
|
||
where cntx = initSDocContext dflags style
|
||
locMsg = mkLocMessage severity srcSpan msg
|
||
printDoc = show (runSDoc locMsg cntx)
|
||
```
|
||
|
||
# Package databases
|
||
|
||
A *package database* is a directory where the information about your
|
||
installed packages is stored. For each package registered in the
|
||
database there is a .conf file with the package details. The .conf
|
||
file contains the package description (just like in the .cabal file)
|
||
as well as path to binaries and a list of resolved dependencies:
|
||
|
||
$ cat aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a.conf
|
||
name: aeson
|
||
version: 0.6.1.0.1
|
||
id: aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a
|
||
<..snip..>
|
||
import-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
|
||
library-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
|
||
<..snip..>
|
||
depends: attoparsec-0.10.4.0-acffb7126aca47a107cf7722d75f1f5e
|
||
base-4.7.0.0-b67b4d8660168c197a2f385a9347434d
|
||
blaze-builder-0.3.1.1-9fd49ac1608ca25e284a8ac6908d5148
|
||
bytestring-0.10.3.0-66e3f5813c3dc8ef9647156d1743f0ef
|
||
<..snip..>
|
||
|
||
You can use `ghc-pkg` to manage installed packages on your system. For
|
||
example, to list all the packages you've installed run `ghc-pkg list`.
|
||
To list all the package databases that are automatically picked up by
|
||
`ghc-pkg` do the following:
|
||
|
||
$ ghc-pkg nonexistentpkg
|
||
/home/dan/ghc/lib/ghc-7.7.20130722/package.conf.d
|
||
/home/dan/.ghc/i386-linux-7.7.20130722/package.conf.d
|
||
|
||
See `ghc-pkg --help` or the [online documentation](http://www.haskell.org/ghc/docs/latest/html/users_guide/packages.html#package-management) for more details.
|
||
|
||
## Adding a package db
|
||
|
||
By default GHC knows only about two package databases: the global
|
||
package database (usually `/usr/lib/ghc-something/` on Linux) and the
|
||
user-specific database (usually `~/.ghc/lib`). In order to pick up a
|
||
package that resides in a different package database you have to
|
||
employ some tricks.
|
||
|
||
For some reason GHC API does not export an clear and easy-to-use
|
||
function that would allow you to do that, although the code we need is
|
||
present in the GHC sources.
|
||
|
||
The way this whole thing works is the following:
|
||
|
||
1. GHC calls [initPackages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:initPackages),
|
||
which reads the database files and sets up the [internal
|
||
table](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#t:PackageState)
|
||
of package information
|
||
|
||
2. The reading of package databases is performed via the
|
||
[readPackageConfigs](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:readPackageConfigs)
|
||
function. It reads the user package database, the global package
|
||
database, the "GHC_PACKAGE_PATH" environment variable, and *applies
|
||
the extraPkgConfs function*, which is a dynflag and has the following
|
||
type: `extraPkgConfs :: [PkgConfRef] -> [PkgConfRef]`
|
||
([PkgConfRef](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:PkgConfRef)
|
||
is a type representing the package database). The `extraPkgConf` flag
|
||
is supposed to represent the `-package-db` command line option.
|
||
|
||
3. Once the database is parsed, the loaded packages are stored in the
|
||
`pkgDatabase` dynflag which is a list of [PackageConfig](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/PackageConfig.html#t:PackageConfig)s
|
||
|
||
|
||
So, in order to add a package database to the current session we have
|
||
to simply modify the `extraPkgConfs` dynflag. Actually, there is
|
||
already a function [present](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/src/DynFlags.html#line-3656) in the GHC source that does exactly what we
|
||
need: `addPkgConfRef :: PkgConfRef -> DynP ()`. Unfortunately it's not
|
||
exported so we can't use it in our own code. I rolled my own functions
|
||
that I am using in the interactive-diagrams project, feel free to copy
|
||
them:
|
||
|
||
```haskell
|
||
-- | Add a package database to the Ghc monad
|
||
#if __GLASGOW_HASKELL_ >= 707
|
||
addPkgDb :: GhcMonad m => FilePath -> m ()
|
||
#else
|
||
addPkgDb :: (MonadIO m, GhcMonad m) => FilePath -> m ()
|
||
#endif
|
||
addPkgDb fp = do
|
||
dfs <- getSessionDynFlags
|
||
let pkg = PkgConfFile fp
|
||
let dfs' = dfs { extraPkgConfs = (pkg:) . extraPkgConfs dfs }
|
||
setSessionDynFlags dfs'
|
||
#if __GLASGOW_HASKELL_ >= 707
|
||
_ <- initPackages dfs'
|
||
#else
|
||
_ <- liftIO $ initPackages dfs'
|
||
#endif
|
||
return ()
|
||
|
||
-- | Add a list of package databases to the Ghc monad
|
||
-- This should be equivalen to
|
||
-- > addPkgDbs ls = mapM_ addPkgDb ls
|
||
-- but it is actaully faster, because it does the package
|
||
-- reintialization after adding all the databases
|
||
#if __GLASGOW_HASKELL_ >= 707
|
||
addPkgDbs :: GhcMonad m => [FilePath] -> m ()
|
||
#else
|
||
addPkgDbs :: (MonadIO m, GhcMonad m) => [FilePath] -> m ()
|
||
#endif
|
||
addPkgDbs fps = do
|
||
dfs <- getSessionDynFlags
|
||
let pkgs = map PkgConfFile fps
|
||
let dfs' = dfs { extraPkgConfs = (pkgs ++) . extraPkgConfs dfs }
|
||
setSessionDynFlags dfs'
|
||
#if __GLASGOW_HASKELL_ >= 707
|
||
_ <- initPackages dfs'
|
||
#else
|
||
_ <- liftIO $ initPackages dfs'
|
||
#endif
|
||
return ()
|
||
```
|
||
|
||
## Links
|
||
|
||
- [Packages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html) module,
|
||
contains other functions that modify/make use of `extraPkgConfs`
|