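CHXHtml (Compliant Haskell XHTML) is a Haskell library for static or dynamic (X)HTML production. What sets it apart from other (X)HTML production libraries is that it uses the type system to reject, at compile time, tag and attribute nestings that the DTD does not allow.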
Currently tested on GHC 6.10.3 and 7.0.3 with CHXHtml 0.2.0.
$ cabal install CHXHtml
Simple HTML-like syntax. If you know even a little Haskell, the syntax is easy to pick up: no fancy combinators or monads, just nested data structures. For example...
import Text.CHXHtml.XHtml1_strict

helloWorld = _html
    [ _head [_title [pcdata "Hello World Page"]]
    , _body
        [ _h1 [pcdata "Hello World"]
        , _hr
        , _div [a_ [href_att "http://google.com"] [pcdata "Search!"]]
        ]
    ]
creates the structure. To render it to a string, use the render function (putStrLn is used here so that the newlines are displayed)...
ghci> putStrLn (render helloWorld)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="Content Type" content="text/html;charset=UTF-8" />
<title>Hello World Page</title>
</head>
<body><h1>Hello World</h1>
<div><a href="http://google.com">Search!</a>
</div>
</body>
</html>
Each tag is represented by two functions, each of which produces the appropriate constructor for its context. The _<tag> function does not allow attributes, while the <tag>_ function does. Non-attribute versions accept a list of child tags, which may have children themselves as long as the DTD allows it. Attribute versions accept a list of attributes and values before the list of children.
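The phrase "as long as the DTD allows it" can be pictured with a toy model. The sketch below is not CHXHtml's implementation, and every type and function name in it (Block, Inline, Body, renderBody and so on) is hypothetical. It only illustrates the general idea of giving each parent context its own child type, so that the compiler rejects a nesting the grammar does not permit.

```haskell
-- Toy model only; these types are hypothetical and are not CHXHtml's.
-- Each parent context gets its own child type, so the type checker
-- rejects nestings that this toy grammar does not allow.

-- Block-level content: may appear directly under the body.
data Block
  = H1 [Inline]
  | P [Inline]
  | Div [Block]

-- Inline content: may appear only inside H1, P or Em in this toy grammar.
data Inline
  = Text String
  | Em [Inline]

newtype Body = Body [Block]

renderBody :: Body -> String
renderBody (Body bs) = tag "body" (concatMap renderBlock bs)

renderBlock :: Block -> String
renderBlock (H1 xs)  = tag "h1" (concatMap renderInline xs)
renderBlock (P xs)   = tag "p" (concatMap renderInline xs)
renderBlock (Div bs) = tag "div" (concatMap renderBlock bs)

renderInline :: Inline -> String
renderInline (Text s) = s  -- no escaping in this toy model
renderInline (Em xs)  = tag "em" (concatMap renderInline xs)

-- Wrap already-rendered content in an opening and closing tag.
tag :: String -> String -> String
tag name inner = "<" ++ name ++ ">" ++ inner ++ "</" ++ name ++ ">"

valid :: Body
valid = Body [H1 [Text "Hello"], Div [P [Em [Text "World"]]]]

-- invalid = Body [Text "Hello"]
-- Does not compile: Text builds an Inline, but Body expects Blocks.
```

Loading this in the interpreter and evaluating renderBody valid gives "<body><h1>Hello</h1><div><p><em>World</em></p></div></body>". Uncommenting the invalid definition produces a type error instead of bad markup.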
In general there are nine functions, with type signatures:
_<tag>              :: [Ent] -> Ent
<tag>_              :: [Attribute] -> [Ent] -> Ent
render              :: Ent -> String
render_bs           :: Ent -> Data.ByteString
pcdata              :: String -> Ent
pcdata_bs           :: Data.ByteString -> Ent
<attribute>_att     :: String -> Attribute
<attribute>_att_bs  :: Data.ByteString -> Attribute
htmlHelp            :: [String] -> [[String]]
Here <tag> is any tag and <attribute> is any attribute name, but only where the DTD allows it! Yes, Ent and Attribute are gross simplifications, but you get the idea. Nasty compile-time type errors will occur if you attempt an invalid structure. pcdata provides a way to get regular text in, and that text is HTML-escaped for safety. Use pcdata_bs for JavaScript, since it is not escaped. Again, only where allowed.
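To make "HTML escaped" concrete, here is a minimal sketch of what such escaping typically does. The function name escapeHtml is hypothetical and the exact set of characters CHXHtml escapes is an assumption; the library's own routine may differ.

```haskell
-- Hypothetical helper; CHXHtml's own escaping code may differ.
-- Replaces the characters that are significant in (X)HTML markup
-- with their entity references.
escapeHtml :: String -> String
escapeHtml = concatMap escapeChar
  where
    escapeChar '<'  = "&lt;"
    escapeChar '>'  = "&gt;"
    escapeChar '&'  = "&amp;"
    escapeChar '"'  = "&quot;"
    escapeChar '\'' = "&#39;"
    escapeChar c    = [c]
```

With this definition, escapeHtml "if (a < b && c) {}" evaluates to "if (a &lt; b &amp;&amp; c) {}". That is the right treatment for visible text, and it also shows why script source has to go through the unescaped path: the escaped form is no longer the same JavaScript.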
So you may be wondering if you have to constantly look at the DTD to find what tags and attributes are allowed. Fear no more, htmlHelp to the rescue!
Provide a list of parent tags ([String]), and htmlHelp gives a list of allowed children and attributes, or an error. For example, to find out which children or attributes the inner div in the example above takes, I would run the following in the interpreter after loading the library:
*Main> htmlHelp ["html","body","div"]
[["a","abbr","acronym","address","b","bdo","big","blockquote","br","button","cite","code","del","dfn",
"div","dl","em","fieldset","form","h1","h2","h3","h4","h5","h6","hr","i","img","input","ins","kbd",
"label","map","noscript","object","ol","p","pcdata","pre","q","samp","script","select","small","span",
"strong","sub","sup","table","textarea","tt","ul","var"],
["class_att","dir_att","id_att","lang_att",
"onclick_att","ondblclick_att","onkeydown_att","onkeypress_att","onkeyup_att","onmousedown_att",
"onmousemove_att","onmouseout_att","onmouseover_att","onmouseup_att","style_att","title_att"]]
So this nesting is valid, and I get the possible child tags and node attributes. Note that all queries should start with html.
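The raw nested list is hard to scan in the interpreter. The small formatter below is a sketch, not part of CHXHtml: showHelp and sample are hypothetical names. It assumes the result shape seen in the htmlHelp output for the div query, with child tags in the first inner list and attributes in the second.

```haskell
import Data.List (intercalate)

-- Hypothetical helper, not part of CHXHtml. Assumes a two-element result:
-- child tags first, attribute names second.
showHelp :: [[String]] -> String
showHelp [children, attributes] =
  unlines
    [ "children:   " ++ intercalate ", " children
    , "attributes: " ++ intercalate ", " attributes
    ]
-- Fallback for any other shape: one comma-separated line per inner list.
showHelp other = unlines (map (intercalate ", ") other)

-- Abbreviated sample taken from the htmlHelp output for the div query.
sample :: [[String]]
sample = [["a", "abbr", "acronym"], ["class_att", "dir_att", "id_att"]]
```

Running putStr (showHelp sample) prints a "children:" line followed by an "attributes:" line. With the library loaded, the same helper should work on a real query such as putStr (showHelp (htmlHelp ["html","body","div"])).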
To play with htmlHelp without touching Haskell, visit htmlHelp for an online interface, which runs the actual htmlHelp function!
Strings are painfully slow in Haskell due to their linked-list nature. We use Data.ByteString.Lazy internally to store and manipulate character data efficiently, and we expose this data through the _bs suffix on our input and output functions. As of right now, pcdata_bs does NOT HTML-escape the character data, unlike pcdata. We trust you to respect this feature.
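Because pcdata_bs passes bytes through untouched, any untrusted text sent down the ByteString path has to be escaped by the caller. The sketch below shows one way to do that. The name escapeBs is hypothetical, and it assumes the _bs functions take the lazy ByteString type, as the mention of Data.ByteString.Lazy suggests.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Hypothetical helper, not part of CHXHtml: escape markup-significant
-- characters in a lazy ByteString before handing it to an unescaped
-- function. Only ASCII bytes are rewritten, so UTF-8 multi-byte
-- sequences (all bytes >= 0x80) pass through unchanged.
escapeBs :: BL.ByteString -> BL.ByteString
escapeBs = BL.concatMap escapeChar
  where
    escapeChar '<' = BL.pack "&lt;"
    escapeChar '>' = BL.pack "&gt;"
    escapeChar '&' = BL.pack "&amp;"
    escapeChar '"' = BL.pack "&quot;"
    escapeChar c   = BL.singleton c
```

For example, BL.unpack (escapeBs (BL.pack "a < b & c")) evaluates to "a &lt; b &amp; c". Script source meant to run as JavaScript should skip this step.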
Memoization tricks to come...