title: GHC API tutorial
date: 2014-06-18 00:00
tags: haskell
---

# Intro

*Disclaimer: these notes were written a couple of years ago. While
some of the basic facts are still the case, some details may be
subject to change. Trust, but verify!*

It’s hard to get into writing code that uses the GHC API. The API itself is large, and the number of functions and options significantly outnumbers the tutorials available.

In this note I will elaborate on some of the peculiar and interesting problems I’ve encountered while writing code that uses the GHC API, and share various tips I find useful.

Many of the points I make in this note are actually trivial, but I nevertheless made every mistake mentioned here myself, perhaps due to my naive approach of quickly diving in and experimenting instead of reading the documentation and source code.

This note is an adaptation of a [series of my blog posts](http://parenz.wordpress.com/tag/ghc/) on the subject.

To run the examples presented in this note you'll have to

1. Install the `ghc-paths` library
2. Compile the code with `ghc -package ghc file.hs` OR
3. Expose the `ghc-7.8.x` library: `ghc-pkg expose ghc-7.8.2`

The up-to-date Haddocks for GHC are located [here](http://www.haskell.org/ghc/docs/latest/html/libraries/ghc/index.html).

# Interpreted, compiled, and package modules

There are different ways of bringing the contents of Haskell modules into scope, a process that is necessary for evaluating/interpreting bits of code on the fly. We will walk through some of the caveats of this process.

## Interpreted modules

Imagine the following situation: we have a Haskell source file with code we want to load dynamically and evaluate. This is a basic task in GHC API terms, but there are actually different ways of doing it.

Let us have a file 'test.hs' containing the code we want to access:

```haskell
module Test (test) where
test :: Int
test = 123
```

The basic way to get the 'test' data would be to load `Test` as an interpreted module:

```haskell
import Control.Applicative
import DynFlags
import GHC
import GHC.Paths
import MonadUtils (liftIO)
import Unsafe.Coerce

main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
    runGhc (Just libdir) $ do
        -- we have to call 'setSessionDynFlags' before doing
        -- everything else
        dflags <- getSessionDynFlags
        -- If we want to make GHC interpret our code on the fly, we
        -- ought to set those two flags, otherwise we
        -- wouldn't be able to use 'setContext' below
        setSessionDynFlags $ dflags { hscTarget = HscInterpreted
                                    , ghcLink   = LinkInMemory
                                    }
        setTargets =<< sequence [guessTarget "test.hs" Nothing]
        load LoadAllTargets
        -- Bringing the module into the context
        setContext [IIModule $ mkModuleName "Test"]
        -- evaluating and running an action
        act <- unsafeCoerce <$> compileExpr "print test"
        liftIO act
```

We have to use `HscInterpreted` and `LinkInMemory` because otherwise GHC would compile test.hs in the current directory and leave test.hi and test.o files behind, which we would not be able to load in interpreted mode. `setContext`, however, will try to pick up the code from those files first when looking for the module `Test`.

    $ ghc -package ghc --make Interp.hs
    [1 of 1] Compiling Main ( Interp.hs, Interp.o )
    Linking Interp ...
    $ ./Interp
    123

Yay, it works! But let's try something fancier, like printing a list of integers one by one. For that, we will use the awesome `forM_` function.

    --- the rest of the file is the same
    --- ...
            -- evaluating and running an action
            act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
            liftIO act

When we try to run it:

    $ ./Interp
    Interp: panic! (the 'impossible' happened)
    (GHC version 7.8.2 for x86_64-apple-darwin):
    Not in scope: ‘forM_’

Well, fair enough: where would we expect GHC to get `forM_` from? We have to bring `Control.Monad` into scope first.

This brings us to the next section.

## Package modules

Naively, we might want to load `Control.Monad` in the same way we loaded `test.hs`.

```haskell
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
    runGhc (Just libdir) $ do
        dflags <- getSessionDynFlags
        setSessionDynFlags $ dflags { hscTarget = HscInterpreted
                                    , ghcLink   = LinkInMemory
                                    }
        setTargets =<< sequence [ guessTarget "test.hs" Nothing
                                , guessTarget "Control.Monad" Nothing]
        load LoadAllTargets
        -- Bringing the module into the context
        setContext [IIModule $ mkModuleName "Test"]

        -- evaluating and running an action
        act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
        liftIO act
```

Unfortunately, this attempt fails:

    Interp: panic! (the 'impossible' happened)
    (GHC version 7.8.2 for x86_64-apple-darwin):
    module ‘Control.Monad’ is a package module

Huh, what? I thought `guessTarget` works on all kinds of modules.

Well, it does. But it doesn't "load the module", it merely sets it as the *target for compilation*. Basically, it (together with `load LoadAllTargets`) is equivalent to calling `ghc --make`. And surely it doesn't make much sense to `ghc --make Control.Monad` when `Control.Monad` is a module from the base package. What we need to do instead is to bring the compiled `Control.Monad` module into scope. Luckily, that is not very hard to do with the help of the `simpleImportDecl :: ModuleName -> ImportDecl name` function:

```haskell
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
    runGhc (Just libdir) $ do
        -- we have to call 'setSessionDynFlags' before doing
        -- everything else
        dflags <- getSessionDynFlags
        -- If we want to make GHC interpret our code on the fly, we
        -- ought to set those two flags, otherwise we
        -- wouldn't be able to use 'setContext' below
        setSessionDynFlags $ dflags { hscTarget = HscInterpreted
                                    , ghcLink   = LinkInMemory
                                    }
        setTargets =<< sequence [ guessTarget "test.hs" Nothing ]
        load LoadAllTargets
        -- Bringing the module into the context
        setContext [ IIModule $ mkModuleName "Test"
                   , IIDecl . simpleImportDecl
                            . mkModuleName $ "Control.Monad" ]
        -- evaluating and running an action
        act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
        liftIO act
```

And this works like a charm:

    $ ./Interp
    1
    2
    123

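
When several package modules are needed, the `IIDecl . simpleImportDecl . mkModuleName` chain gets repetitive. Below is a minimal sketch of two helpers, assuming the GHC 7.8-era API used in the rest of this note. The names `importsOf` and `qualifiedImport` are made up for illustration and are not part of the GHC API; the `ideclQualified` and `ideclAs` fields have the shapes they had in GHC 7.8 and differ in later releases.

```haskell
import GHC

-- | Plain, unqualified imports of already compiled modules.
-- (Hypothetical helper, not part of the GHC API.)
importsOf :: [String] -> [InteractiveImport]
importsOf = map (IIDecl . simpleImportDecl . mkModuleName)

-- | A qualified import, like @import qualified Data.Map as M@.
-- (Hypothetical helper; field types as in GHC 7.8.)
qualifiedImport :: String -> String -> InteractiveImport
qualifiedImport modName alias =
    let decl = simpleImportDecl (mkModuleName modName)
    in  IIDecl decl { ideclQualified = True
                    , ideclAs        = Just (mkModuleName alias)
                    }
```

With these, the context from the example above could be written as `setContext (IIModule (mkModuleName "Test") : importsOf ["Control.Monad"])`.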
## Compiled modules

What we have implemented so far corresponds to the `:load *test.hs`
command in GHCi, which gives us full access to the source code of
the program. To illustrate this, let's modify our test file:

```haskell
module Test (test) where

test :: Int
test = 123

test2 :: String
test2 = "Hi"
```

Now, if we load that file as an interpreted module and evaluate
`test2`, nothing will stop us from doing so, even though `test2` is not exported.

    $ ./Interp2
    (123,"Hi")

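
The source of `Interp2` is not shown in these notes. A minimal sketch of what it could look like, assuming it differs from `Interp` only in the evaluated expression and targeting the same GHC 7.8-era API:

```haskell
import Control.Applicative ((<$>))
import DynFlags
import GHC
import GHC.Paths (libdir)
import MonadUtils (liftIO)
import Unsafe.Coerce (unsafeCoerce)

-- Assumed reconstruction of Interp2; only the expression passed to
-- 'compileExpr' differs from the Interp example.
main :: IO ()
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
    runGhc (Just libdir) $ do
        dflags <- getSessionDynFlags
        _ <- setSessionDynFlags dflags { hscTarget = HscInterpreted
                                       , ghcLink   = LinkInMemory
                                       }
        setTargets =<< sequence [guessTarget "test.hs" Nothing]
        _ <- load LoadAllTargets
        -- IIModule exposes the whole top level of the module,
        -- including the unexported 'test2'
        setContext [IIModule $ mkModuleName "Test"]
        act <- unsafeCoerce <$> compileExpr "print (test, test2)"
        liftIO act
```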
If we want to use the compiled module (like `:load test.hs` in GHCi),
we have to bring `Test` into the context the same way we dealt with
`Control.Monad`:

```haskell
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
    runGhc (Just libdir) $ do
        dflags <- getSessionDynFlags
        setSessionDynFlags $ dflags { hscTarget = HscInterpreted
                                    , ghcLink   = LinkInMemory
                                    }
        setTargets =<< sequence [ guessTarget "Test" Nothing ]
        load LoadAllTargets
        -- Bringing the module into the context
        setContext [ IIDecl $ simpleImportDecl (mkModuleName "Test")
                   , IIDecl $ simpleImportDecl (mkModuleName "Prelude")
                   ]
        printExpr "test"
        printExpr "test2"


printExpr :: String -> Ghc ()
printExpr expr = do
    liftIO $ putStrLn ("-- Going to print " ++ expr)
    act <- unsafeCoerce <$> compileExpr ("print (" ++ expr ++ ")")
    liftIO act
```

The output:

    $ ./Interp2
    -- Going to print test
    123
    -- Going to print test2
    target: panic! (the 'impossible' happened)
      (GHC version 7.6.3 for x86_64-apple-darwin):
        Not in scope: `test2'
        Perhaps you meant `test' (imported from Test)

    Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

The lookup of `test2` fails because an import declaration only brings the names exported by `Test` into scope, and `test2` is not in the export list.

# Error handling

Our exercises above produced a number of GHC panics/errors. By default
you can expect GHC to spew all the errors onto your screen, but you
might want to handle them differently, e.g. log them.

Naturally, at first I tried the exception handling mechanism:

```haskell
-- Main.hs:
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Applicative ((<$>))
import GHC
import GHC.Paths
import MonadUtils
import Exception
import Panic
import Unsafe.Coerce
import System.IO.Unsafe

-- I thought this code would handle the exception
handleException :: (ExceptionMonad m, MonadIO m)
                   => m a -> m (Either String a)
handleException m =
  ghandle (\(ex :: SomeException) -> return (Left (show ex))) $
  handleGhcException (\ge -> return (Left (showGhcException ge ""))) $
  flip gfinally (liftIO restoreHandlers) $
  m >>= return . Right

-- Initializations, needed if you want to compile code on the fly
initGhc :: Ghc ()
initGhc = do
  dfs <- getSessionDynFlags
  setSessionDynFlags $ dfs { hscTarget = HscInterpreted
                           , ghcLink   = LinkInMemory }
  return ()


-- main entry point
main = test >>= print

test :: IO (Either String Int)
test = handleException $ runGhc (Just libdir) $ do
  initGhc
  setTargets =<< sequence [ guessTarget "file1.hs" Nothing ]
  graph <- depanal [] False
  loaded <- load LoadAllTargets
  -- when (failed loaded) $ throw LoadingException
  setContext (map (IIModule . moduleName . ms_mod) graph)
  let expr = "run"
  res <- unsafePerformIO . unsafeCoerce <$> compileExpr expr
  return res


-- file1.hs:
module Main where

main = return ()

run :: IO Int
run = do
  n <- x
  return (n+1)
```

The problem is that when I run the 'test' function above I receive the
following output:

    h> test

    test/file1.hs:4:10: Not in scope: `x'

    Left "Cannot add module Main to context: not a home module"
    it :: Either String Int

It appears that the exception handler did catch *an* error, but a
peculiar one, and not the one I wanted to catch. As far as I can tell,
the scope error is only printed by GHC's logger while `load` merely
returns a failure flag, so the first exception actually thrown is the
one from the subsequent `setContext`.

## Solution

We've studied the GHC API together with Luite Stegeman and I think
we've found a more or less satisfactory solution.

Errors are handled using the
[LogAction](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:LogAction)
specified in the
[DynFlags](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:DynFlags)
for your GHC session. So to fix this you need to change the 'log_action'
field of the session's DynFlags. For example, you can do this:

```haskell
-- additional imports needed on top of the ones in Main.hs
import Data.IORef
import DynFlags
import ErrUtils (Severity(..), mkLocMessage)
import Outputable (initSDocContext, runSDoc)

initGhc = do
  -- ..
  ref <- liftIO $ newIORef ""
  dfs <- getSessionDynFlags
  setSessionDynFlags $ dfs { hscTarget  = HscInterpreted
                           , ghcLink    = LinkInMemory
                           , log_action = logHandler ref -- ^ this
                           }

-- LogAction == DynFlags -> Severity -> SrcSpan -> PprStyle -> MsgDoc -> IO ()
logHandler :: IORef String -> LogAction
logHandler ref dflags severity srcSpan style msg =
  case severity of
     SevError -> modifyIORef' ref (++ printDoc)
     SevFatal -> modifyIORef' ref (++ printDoc)
     _        -> return () -- ignore the rest
  where cntx = initSDocContext dflags style
        locMsg = mkLocMessage severity srcSpan msg
        printDoc = show (runSDoc locMsg cntx)
```

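
The snippet above stores the messages in an `IORef` but does not show how to get them back out. Here is a minimal sketch of one way to do that, assuming the GHC 7.8-era API and the module locations used in this note. `initGhcWithLog` and `loadOrReport` are hypothetical names introduced for this example; `logHandler` is repeated so the block stands on its own.

```haskell
import Control.Applicative ((<$>))
import Data.IORef
import DynFlags
import ErrUtils (Severity(..), mkLocMessage)
import GHC
import GHC.Paths (libdir)
import MonadUtils (liftIO)
import Outputable (initSDocContext, runSDoc)

-- Collect error and fatal messages in the IORef, ignore everything else.
logHandler :: IORef String -> LogAction
logHandler ref dflags severity srcSpan style msg =
  case severity of
     SevError -> modifyIORef' ref (++ printDoc)
     SevFatal -> modifyIORef' ref (++ printDoc)
     _        -> return ()
  where cntx = initSDocContext dflags style
        locMsg = mkLocMessage severity srcSpan msg
        printDoc = show (runSDoc locMsg cntx)

-- Hypothetical variant of 'initGhc' that hands the IORef back to the caller.
initGhcWithLog :: Ghc (IORef String)
initGhcWithLog = do
  ref <- liftIO $ newIORef ""
  dfs <- getSessionDynFlags
  _ <- setSessionDynFlags $ dfs { hscTarget  = HscInterpreted
                                , ghcLink    = LinkInMemory
                                , log_action = logHandler ref
                                }
  return ref

-- Load a file; if loading fails, return whatever the handler collected.
loadOrReport :: FilePath -> IO (Either String ())
loadOrReport file = runGhc (Just libdir) $ do
  ref <- initGhcWithLog
  setTargets =<< sequence [ guessTarget file Nothing ]
  loaded <- load LoadAllTargets
  case loaded of
    Succeeded -> return (Right ())
    Failed    -> Left <$> liftIO (readIORef ref)

main :: IO ()
main = loadOrReport "file1.hs" >>= print
```

Checking the `SuccessFlag` returned by `load` before calling `setContext` is the design choice here: it avoids the unrelated "not a home module" exception seen earlier and reports the collected compiler messages instead.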
# Package databases

A *package database* is a directory where the information about your
installed packages is stored. For each package registered in the
database there is a .conf file with the package details. The .conf
file contains the package description (just like in the .cabal file)
as well as paths to the binaries and a list of resolved dependencies:

    $ cat aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a.conf
    name: aeson
    version: 0.6.1.0.1
    id: aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a
    <..snip..>
    import-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
    library-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
    <..snip..>
    depends: attoparsec-0.10.4.0-acffb7126aca47a107cf7722d75f1f5e
             base-4.7.0.0-b67b4d8660168c197a2f385a9347434d
             blaze-builder-0.3.1.1-9fd49ac1608ca25e284a8ac6908d5148
             bytestring-0.10.3.0-66e3f5813c3dc8ef9647156d1743f0ef
    <..snip..>

You can use `ghc-pkg` to manage installed packages on your system. For
example, to list all the packages you've installed run `ghc-pkg list`.
To list all the package databases that are automatically picked up by
`ghc-pkg`, ask it to list a package that does not exist:

    $ ghc-pkg list nonexistentpkg
    /home/dan/ghc/lib/ghc-7.7.20130722/package.conf.d
    /home/dan/.ghc/i386-linux-7.7.20130722/package.conf.d

See `ghc-pkg --help` or the [online documentation](http://www.haskell.org/ghc/docs/latest/html/users_guide/packages.html#package-management) for more details.

## Adding a package db

By default GHC knows only about two package databases: the global
package database (usually `/usr/lib/ghc-something/` on Linux) and the
user-specific database (usually under `~/.ghc/`). In order to pick up a
package that resides in a different package database you have to
employ some tricks.

For some reason the GHC API does not export a clear and easy-to-use
function that would allow you to do that, although the code we need is
present in the GHC sources.

The way this whole thing works is the following:

1. GHC calls [initPackages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:initPackages),
   which reads the database files and sets up the [internal
   table](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#t:PackageState)
   of package information

2. The reading of package databases is performed via the
   [readPackageConfigs](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html#v:readPackageConfigs)
   function. It reads the user package database, the global package
   database, the "GHC_PACKAGE_PATH" environment variable, and *applies
   the extraPkgConfs function*, which is a dynflag and has the following
   type: `extraPkgConfs :: [PkgConfRef] -> [PkgConfRef]`
   ([PkgConfRef](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/DynFlags.html#t:PkgConfRef)
   is a type representing the package database). The `extraPkgConfs` flag
   is supposed to represent the `-package-db` command line option.

3. Once the database is parsed, the loaded packages are stored in the
   `pkgDatabase` dynflag which is a list of [PackageConfig](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/PackageConfig.html#t:PackageConfig)s

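
To see step 3 in action, the `pkgDatabase` field can be inspected after the session has been initialised. This is a minimal sketch, assuming the GHC 7.8-era API in which `pkgDatabase` has type `Maybe [PackageConfig]` and is filled in by `setSessionDynFlags`:

```haskell
import DynFlags (pkgDatabase)
import GHC
import GHC.Paths (libdir)
import MonadUtils (liftIO)

main :: IO ()
main = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  -- setSessionDynFlags initialises the package state
  _ <- setSessionDynFlags dflags
  dflags' <- getSessionDynFlags
  liftIO $ case pkgDatabase dflags' of
    Nothing   -> putStrLn "package databases have not been read yet"
    Just pkgs -> putStrLn ("packages known to this session: "
                           ++ show (length pkgs))
```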
So, in order to add a package database to the current session we
simply have to modify the `extraPkgConfs` dynflag. Actually, there is
already a function [present](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/src/DynFlags.html#line-3656) in the GHC source that does exactly what we
need: `addPkgConfRef :: PkgConfRef -> DynP ()`. Unfortunately it's not
exported, so we can't use it in our own code. I rolled my own functions
that I am using in the interactive-diagrams project, feel free to copy
them:

```haskell
{-# LANGUAGE CPP #-}
-- | Add a package database to the Ghc monad
#if __GLASGOW_HASKELL__ >= 707
addPkgDb :: GhcMonad m => FilePath -> m ()
#else
addPkgDb :: (MonadIO m, GhcMonad m) => FilePath -> m ()
#endif
addPkgDb fp = do
  dfs <- getSessionDynFlags
  let pkg  = PkgConfFile fp
  let dfs' = dfs { extraPkgConfs = (pkg:) . extraPkgConfs dfs }
  _ <- setSessionDynFlags dfs'
  -- 'initPackages' is an IO action, so it has to be lifted
  _ <- liftIO $ initPackages dfs'
  return ()

-- | Add a list of package databases to the Ghc monad
-- This should be equivalent to
-- > addPkgDbs ls = mapM_ addPkgDb ls
-- but it is actually faster, because it does the package
-- reinitialization after adding all the databases
#if __GLASGOW_HASKELL__ >= 707
addPkgDbs :: GhcMonad m => [FilePath] -> m ()
#else
addPkgDbs :: (MonadIO m, GhcMonad m) => [FilePath] -> m ()
#endif
addPkgDbs fps = do
  dfs <- getSessionDynFlags
  let pkgs = map PkgConfFile fps
  let dfs' = dfs { extraPkgConfs = (pkgs ++) . extraPkgConfs dfs }
  _ <- setSessionDynFlags dfs'
  -- 'initPackages' is an IO action, so it has to be lifted
  _ <- liftIO $ initPackages dfs'
  return ()
```

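
An alternative that avoids a separate re-initialisation step is to register the extra database in the very first `setSessionDynFlags` call, so that the initial package initialisation already sees it. This is a minimal sketch, assuming the GHC 7.8-era API; `runGhcWithPkgDb` is a hypothetical name and not part of the GHC API.

```haskell
import DynFlags
import GHC
import GHC.Paths (libdir)

-- | Run a Ghc action in an interpreted session that also sees the
-- package database at the given path. (Hypothetical helper.)
runGhcWithPkgDb :: FilePath -> Ghc a -> IO a
runGhcWithPkgDb db action = runGhc (Just libdir) $ do
  dfs <- getSessionDynFlags
  _ <- setSessionDynFlags $
         dfs { hscTarget     = HscInterpreted
             , ghcLink       = LinkInMemory
               -- put the extra database in front of the existing ones
             , extraPkgConfs = (PkgConfFile db :) . extraPkgConfs dfs
             }
  action
```

A call would look like `runGhcWithPkgDb "/path/to/package.conf.d" someAction`, where both the path and `someAction` are placeholders.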
## Links

- [Packages](https://downloads.haskell.org/~ghc/latest/docs/html/libraries/ghc-7.10.2/Packages.html) module,
  contains other functions that modify/make use of `extraPkgConfs`