Streamly.Data.ParserK

See the general notes about parsing in the Streamly.Data.Parser module. This module implements a using Continuation Passing Style (CPS) wrapper over the Streamly.Data.Parser module. It is as fast or faster than attoparsec.

Streamly parsers support all operations offered by popular Haskell parser libraries. They operate on a generic input type, support streaming, and are faster.

The ParserK type represents a stream-consumer as a composition of function calls, therefore, a function call overhead is incurred at each composition. It is reasonably fast in general but may be a few times slower than a fused parser represented by the Parser type. However, it allows for scalable dynamic composition, especially, ParserK can be used in recursive calls. Operations like splitWith on ParserK type have linear (O(n)) performance with respect to the number of compositions.

ParserK is preferred over Parser when extensive applicative, alternative and monadic composition is required, or when recursive or dynamic composition of parsers is required. ParserK also allows efficient parsing of a stream of arrays, it can also break the input stream into a parse result and remaining stream so that the stream can be parsed independently in segments.

Using ParserK

All the parsers from the Streamly.Data.Parser module can be adapted to ParserK using the adaptC, adapt, and adaptCG combinators.

parseChunks runs a parser on a stream of unboxed arrays, this is the preferred and most efficient way to parse chunked input. The more general parseBreakChunks function returns the remaining stream as well along with the parse result. There are parseChunksGeneric, parseBreakChunksGeneric as well to run parsers on boxed arrays. parse, parseBreak run parsers on a stream of individual elements instead of stream of arrays.

Monadic Composition

Monad composition can be used for lookbehind parsers, we can dynamically compose new parsers based on the results of the previously parsed values.

If we have to parse "a9" or "9a" but not "99" or "aa" we can use the following non-monadic, backtracking parser:

>>> digits p1 p2 = ((:) <$> p1 <*> ((:) <$> p2 <*> pure []))
>>> :{
backtracking :: Monad m => ParserK Char m String
backtracking = ParserK.adapt $
    digits (Parser.satisfy isDigit) (Parser.satisfy isAlpha)
    <|>
    digits (Parser.satisfy isAlpha) (Parser.satisfy isDigit)
:}

We know that if the first parse resulted in a digit at the first place then the second parse is going to fail. However, we waste that information and parse the first character again in the second parse only to know that it is not an alphabetic char. By using lookbehind in a Monad composition we can avoid redundant work:

>>> data DigitOrAlpha = Digit Char | Alpha Char

>>> :{
lookbehind :: Monad m => ParserK Char m String
lookbehind = do
    x1 <- ParserK.adapt $
             Digit <$> Parser.satisfy isDigit
         <|> Alpha <$> Parser.satisfy isAlpha
    -- Note: the parse depends on what we parsed already
    x2 <- ParserK.adapt $
          case x1 of
             Digit _ -> Parser.satisfy isAlpha
             Alpha _ -> Parser.satisfy isDigit
    return $ case x1 of
        Digit x -> [x,x2]
        Alpha x -> [x,x2]
:}

Experimental APIs

Please refer to Streamly.Internal.Data.ParserK for functions that have not yet been released.

Setup

To execute the code examples provided in this module in ghci, please run the following commands first.

>>> :m
>>> import Control.Applicative ((<|>))
>>> import Data.Char (isDigit, isAlpha)

>>> import Streamly.Data.Parser (Parser)
>>> import Streamly.Data.ParserK (ParserK)

>>> import qualified Streamly.Data.Parser as Parser
>>> import qualified Streamly.Data.ParserK as ParserK

For APIs that have not been released yet.

>>> import qualified Streamly.Internal.Data.ParserK as ParserK

Parser Type

data ParserK a m b Source #

A continuation passing style parser representation. A continuation of Steps, each step passes a state and a parse result to the next Step. The resulting Step may carry a continuation that consumes input a and results in another Step. Essentially, the continuation may either consume input without a result or return a result with no further input to be consumed.

Instances

Instances details

Monad m => MonadFail (ParserK a m) Source #
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods fail :: String -> ParserK a m a0 Source #
MonadIO m => MonadIO (ParserK a m) Source #
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods liftIO :: IO a0 -> ParserK a m a0 Source #
Monad m => Alternative (ParserK a m) Source #	`p1 <\|> p2` passes the input to parser p1, if it succeeds, the result is returned. However, if p1 fails, the parser driver backtracks and tries the same input on the alternative parser p2, returning the result if it succeeds.
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods empty :: ParserK a m a0 Source # (<\|>) :: ParserK a m a0 -> ParserK a m a0 -> ParserK a m a0 Source # some :: ParserK a m a0 -> ParserK a m [a0] Source # many :: ParserK a m a0 -> ParserK a m [a0] Source #
Monad m => Applicative (ParserK a m) Source #	`f <$> p1 <*> p2` applies parsers p1 and p2 sequentially to an input stream. The first parser runs and processes the input, the remaining input is then passed to the second parser. If both parsers succeed, their outputs are applied to the function `f`. If either parser fails, the operation fails.
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods pure :: a0 -> ParserK a m a0 Source # (<>) :: ParserK a m (a0 -> b) -> ParserK a m a0 -> ParserK a m b Source # liftA2 :: (a0 -> b -> c) -> ParserK a m a0 -> ParserK a m b -> ParserK a m c Source # (>) :: ParserK a m a0 -> ParserK a m b -> ParserK a m b Source # (<*) :: ParserK a m a0 -> ParserK a m b -> ParserK a m a0 Source #
Functor m => Functor (ParserK a m) Source #	Map a function on the result i.e. on `b` in `Parser a m b`.
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods fmap :: (a0 -> b) -> ParserK a m a0 -> ParserK a m b Source # (<$) :: a0 -> ParserK a m b -> ParserK a m a0 Source #
Monad m => Monad (ParserK a m) Source #	Monad composition can be used for lookbehind parsers, we can dynamically compose new parsers based on the results of the previously parsed values.
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods (>>=) :: ParserK a m a0 -> (a0 -> ParserK a m b) -> ParserK a m b Source # (>>) :: ParserK a m a0 -> ParserK a m b -> ParserK a m b Source # return :: a0 -> ParserK a m a0 Source #
Monad m => MonadPlus (ParserK a m) Source #	`mzero` is same as `empty`, it aborts the parser. `mplus` is same as `<\|>`, it selects the first succeeding parser.
Instance details Defined in Streamly.Internal.Data.ParserK.Type Methods mzero :: ParserK a m a0 Source # mplus :: ParserK a m a0 -> ParserK a m a0 -> ParserK a m a0 Source #