Streamly.Internal.Data.Scanl
Left scans.
Scanl vs Fold
Folds and scans both are consumers of streams. A left scan is a generalization of a fold. While the output of a fold is a singleton value, the output of a scan is a stream. A fold is equivalent to a left scan which produces only the final value in the output stream.
Like folds, a scan has an internal state. Unlike a fold, a scan produces an output on each input, the output is a function of the scan state and the input.
A Scanl m a b
can represent a Fold m a b
by discarding the intermediate
outputs and keeping only the final output of the scan.
Since folds do not care about intermediate values, we do not need the extract function for folds. Because folds do not have a requirement for intermediate values, they can be used for implementing combinators like splitWith where intermediate values are not meaningful and are expensive to compute. Folds provide an applicative and monad behavior to consume the stream in parts and compose the folded results. Scans provide Category like composition and stream zip applicative behavior. The finalization function of a fold would return a single value whereas for scan it may be a stream draining the scan buffer. For these reasons, scans and folds are required as independent abstractions.
Scanl vs Pipe
A scan is a simpler version of the consumer side of pipes. A left scan always produces an output whereas a pipe has an additional ability to skip output. Scans are simpler abstractions to think about compared to pipes and easier for the compiler to optimize and fuse.
Compositions
Scans can be chained in the same way as function composition (Category) and can distribute input (tee Applicative). Folds provide an applicative and monad behavior to consume the stream in parts and compose the folded results. Folds are also a special case of parsers.
Imports
>>>
:m
>>>
:set -XFlexibleContexts
>>>
import Control.Monad (void)
>>>
import qualified Data.Foldable as Foldable
>>>
import Data.Bifunctor(bimap)
>>>
import Data.Function ((&))
>>>
import Data.Functor.Identity (Identity, runIdentity)
>>>
import Data.IORef (newIORef, readIORef, writeIORef)
>>>
import Data.Maybe (fromJust, isJust)
>>>
import Data.Monoid (Endo(..), Last(..), Sum(..))
>>>
import Streamly.Data.Array (Array)
>>>
import Streamly.Data.Fold (Fold, Tee(..))
>>>
import Streamly.Data.Stream (Stream)
>>>
import qualified Data.Map as Map
>>>
import qualified Data.Set as Set
>>>
import qualified Data.IntSet as IntSet
>>>
import qualified Streamly.Data.Array as Array
>>>
import qualified Streamly.Data.Fold as Fold
>>>
import qualified Streamly.Data.MutArray as MutArray
>>>
import qualified Streamly.Data.Parser as Parser
>>>
import qualified Streamly.Data.Stream as Stream
>>>
import qualified Streamly.Data.StreamK as StreamK
>>>
import qualified Streamly.Data.Unfold as Unfold
For APIs that have not been released yet.
>>>
import qualified Streamly.Internal.Data.Fold as Fold
>>>
import qualified Streamly.Internal.Data.Scanl as Scanl
>>>
import qualified Streamly.Internal.Data.Stream as Stream
Step Type
Represents the result of the step
of a Fold
. Partial
returns an
intermediate state of the fold, the fold step can be called again with the
state or the driver can use extract
on the state to get the result out.
Done
returns the final result and the fold cannot be driven further.
Pre-release
mapMStep :: Applicative m => (a -> m b) -> Step s a -> m (Step s b) Source #
Map a monadic function over the result b
in Step s b
.
Internal
chainStepM :: Applicative m => (s1 -> m s2) -> (a -> m (Step s2 b)) -> Step s1 a -> m (Step s2 b) Source #
Fold Type
The type Fold m a b
represents a consumer of an input stream of values
of type a
and returning a final value of type b
in Monad
m
. The
constructor of a fold is Fold step initial extract final
.
The fold uses an internal state of type s
. The initial value of the state
s
is created by initial
. This function is called once and only once
before the fold starts consuming input. Any resource allocation can be done
in this function.
The step
function is called on each input, it consumes an input and
returns the next intermediate state (see Step
) or the final result b
if
the fold terminates.
If the fold is used as a scan, the extract
function is used by the scan
driver to map the current state s
of the fold to the fold result. Thus
extract
can be called multiple times. In some folds, where scanning does
not make sense, this function is left unimplemented; such folds cannot be
used as scans.
Before a fold terminates, final
is called once and only once (unless the
fold terminated in initial
itself). Any resources allocated by initial
can be released in final
. In folds that do not require any cleanup
extract
and final
are typically the same.
When implementing fold combinators, care should be taken to cleanup any
state of the argument folds held by the fold by calling the respective
final
at all exit points of the fold. Also, final
should not be called
more than once. Note that if a fold terminates by Done
constructor, there
is no state to cleanup.
NOTE: The constructor is not yet released, smart constructors are provided to create folds.
Constructors
forall s. Scanl (s -> a -> m (Step s b)) (m (Step s b)) (s -> m b) (s -> m b) |
|
Instances
Monad m => Applicative (Scanl m a) Source # | |
Defined in Streamly.Internal.Data.Scanl.Type Methods pure :: a0 -> Scanl m a a0 Source # (<*>) :: Scanl m a (a0 -> b) -> Scanl m a a0 -> Scanl m a b Source # liftA2 :: (a0 -> b -> c) -> Scanl m a a0 -> Scanl m a b -> Scanl m a c Source # (*>) :: Scanl m a a0 -> Scanl m a b -> Scanl m a b Source # (<*) :: Scanl m a a0 -> Scanl m a b -> Scanl m a a0 Source # | |
Functor m => Functor (Scanl m a) Source # | Maps a function on the output of the fold (the type |
Constructors
mkScanlM :: Monad m => (b -> a -> m b) -> m b -> Scanl m a b Source #
Make a scan from a left fold style monadic step function and initial value of the accumulator.
mkScanl1 :: Monad m => (a -> a -> a) -> Scanl m a (Maybe a) Source #
Make a strict left scan, for non-empty streams, using first element as the starting value. Returns Nothing if the stream is empty.
Pre-release
mkScanl1M :: Monad m => (a -> a -> m a) -> Scanl m a (Maybe a) Source #
Like mkScanl1
but with a monadic step function.
Pre-release
mkScant :: Monad m => (s -> a -> Step s b) -> Step s b -> (s -> b) -> Scanl m a b Source #
Make a terminating fold using a pure step function, a pure initial state and a pure state extraction function.
Pre-release
mkScantM :: (s -> a -> m (Step s b)) -> m (Step s b) -> (s -> m b) -> Scanl m a b Source #
Make a terminating fold with an effectful step function and initial state, and a state extraction function.
>>>
mkScantM = Scanl.Scanl
We can just use Scanl
but it is provided for completeness.
Pre-release
mkScanr :: Monad m => (a -> b -> b) -> b -> Scanl m a b Source #
Make a scan using a right fold style step function and a terminal value. It performs a strict right fold via a left fold using function composition. Note that a strict right fold can only be useful for constructing strict structures in memory. For reductions this will be very inefficient.
Definitions:
>>>
mkScanr f z = fmap (flip appEndo z) $ Scanl.foldMap (Endo . f)
>>>
mkScanr f z = fmap ($ z) $ Scanl.mkScanl (\g x -> g . f x) id
Example:
>>>
Stream.toList $ Stream.scanl (Scanl.mkScanr (:) []) $ Stream.enumerateFromTo 1 5
[[],[1],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]
mkScanrM :: Monad m => (a -> b -> m b) -> m b -> Scanl m a b Source #
Like mkScanr but with a monadic step function.
Example:
>>>
toList = Scanl.mkScanrM (\a xs -> return $ a : xs) (return [])
Pre-release
Scans
const :: Applicative m => b -> Scanl m a b Source #
Make a scan that yields the supplied value on any input.
Pre-release
constM :: Applicative m => m b -> Scanl m a b Source #
Make a scan that runs the supplied effect once and then yields the result on any input.
Pre-release
fromRefold :: Refold m c a b -> c -> Scanl m a b Source #
Make a fold from a consumer.
Internal
drain :: Monad m => Scanl m a () Source #
A scan that drains all its input, running the effects and discarding the results.
>>>
drain = Scanl.drainMapM (const (return ()))
>>>
drain = Scanl.mkScanl (\_ _ -> ()) ()
latest :: Monad m => Scanl m a (Maybe a) Source #
Returns the latest element of the input stream, if any.
>>>
latest = Scanl.mkScanl1 (\_ x -> x)
>>>
latest = fmap getLast $ Scanl.foldMap (Last . Just)
functionM :: Monad m => (a -> m (Maybe b)) -> Scanl m a (Maybe b) Source #
Lift a Maybe returning function to a scan.
toList :: Monad m => Scanl m a [a] Source #
Scans the input stream building a list.
Warning! working on large lists accumulated as buffers in memory could be very inefficient, consider using Streamly.Data.Array instead.
>>>
toList = Scanl.mkScanr (:) []
toStreamK :: Monad m => Scanl m a (StreamK n a) Source #
Scans its input building a pure stream.
>>>
toStreamK = fmap StreamK.reverse Scanl.toStreamKRev
Internal
toStreamKRev :: Monad m => Scanl m a (StreamK n a) Source #
Buffers the input stream to a pure stream in the reverse order of the input.
This is more efficient than toStreamK
. toStreamK has exactly the same
performance as reversing the stream after toStreamKRev.
Pre-release
length :: Monad m => Scanl m a Int Source #
Determine the length of the input stream.
Definition:
>>>
length = Scanl.genericLength
>>>
length = fmap getSum $ Scanl.foldMap (Sum . const 1)
maximumBy :: Monad m => (a -> a -> Ordering) -> Scanl m a (Maybe a) Source #
Determine the maximum element in a stream using the supplied comparison function.
maximum :: (Monad m, Ord a) => Scanl m a (Maybe a) Source #
Determine the maximum element in a stream.
Definitions:
>>>
maximum = Scanl.maximumBy compare
>>>
maximum = Scanl.mkScanl1 max
Same as the following but without a default maximum. The Max
Monoid uses
the minBound
as the default maximum:
>>>
maximum = fmap Data.Semigroup.getMax $ Scanl.foldMap Data.Semigroup.Max
minimumBy :: Monad m => (a -> a -> Ordering) -> Scanl m a (Maybe a) Source #
Computes the minimum element with respect to the given comparison function
minimum :: (Monad m, Ord a) => Scanl m a (Maybe a) Source #
Determine the minimum element in a stream using the supplied comparison function.
Definitions:
>>>
minimum = Scanl.minimumBy compare
>>>
minimum = Scanl.mkScanl1 min
Same as the following but without a default minimum. The Min
Monoid uses the
maxBound
as the default maximum:
>>>
maximum = fmap Data.Semigroup.getMin $ Scanl.foldMap Data.Semigroup.Min
rangeBy :: Monad m => (a -> a -> Ordering) -> Scanl m a (Maybe (a, a)) Source #
Find minimum and maximum element using the provided comparison function.
range :: (Monad m, Ord a) => Scanl m a (Maybe (a, a)) Source #
Find minimum and maximum elements i.e. (min, max).
Combinators
Mapping output
rmapM :: Monad m => (b -> m c) -> Scanl m a b -> Scanl m a c Source #
Map a monadic function on the output of a fold.
Mapping Input
lmap :: (a -> b) -> Scanl m b r -> Scanl m a r Source #
lmap f scan
maps the function f
on the input of the scan.
Definition:
>>>
lmap = Scanl.lmapM return
Example:
>>>
sumSquared = Scanl.lmap (\x -> x * x) Scanl.sum
>>>
Stream.toList $ Stream.scanl sumSquared (Stream.enumerateFromTo 1 10)
[0,1,5,14,30,55,91,140,204,285,385]
lmapM :: Monad m => (a -> m b) -> Scanl m b r -> Scanl m a r Source #
lmapM f scan
maps the monadic function f
on the input of the scan.
Filtering
filter :: Monad m => (a -> Bool) -> Scanl m a r -> Scanl m a r Source #
Include only those elements that pass a predicate.
>>>
Stream.toList $ Stream.scanl (Scanl.filter (> 5) Scanl.sum) $ Stream.fromList [1..10]
[0,0,0,0,0,0,6,13,21,30,40]
>>>
filter p = Scanl.postscanlMaybe (Scanl.filtering p)
>>>
filter p = Scanl.filterM (return . p)
>>>
filter p = Scanl.mapMaybe (\x -> if p x then Just x else Nothing)
filtering :: Monad m => (a -> Bool) -> Scanl m a (Maybe a) Source #
A scan for filtering elements based on a predicate.
filterM :: Monad m => (a -> m Bool) -> Scanl m a r -> Scanl m a r Source #
Like filter
but with a monadic predicate.
>>>
f p x = p x >>= \r -> return $ if r then Just x else Nothing
>>>
filterM p = Scanl.mapMaybeM (f p)
catEithers :: Scanl m a b -> Scanl m (Either a a) b Source #
Remove the either wrapper and flatten both lefts and as well as rights in the output stream.
Definition:
>>>
catEithers = Scanl.lmap (either id id)
Pre-release
Trimming
take :: Monad m => Int -> Scanl m a b -> Scanl m a b Source #
Take at most n
input elements and scan them using the supplied scan. A
negative count is treated as 0.
>>>
Stream.toList $ Stream.scanl (Scanl.take 2 Scanl.toList) $ Stream.fromList [1..10]
[[],[1],[1,2]]
takeEndBy_ :: Monad m => (a -> Bool) -> Scanl m a b -> Scanl m a b Source #
Like takeEndBy
but drops the element on which the predicate succeeds.
Example:
>>>
input = Stream.fromList "hello\nthere\n"
>>>
line = Scanl.takeEndBy_ (== '\n') Scanl.toList
>>>
Stream.toList $ Stream.scanl line input
["","h","he","hel","hell","hello","hello"]
takeEndBy :: Monad m => (a -> Bool) -> Scanl m a b -> Scanl m a b Source #
Take the input, stop when the predicate succeeds taking the succeeding element as well.
Example:
>>>
input = Stream.fromList "hello\nthere\n"
>>>
line = Scanl.takeEndBy (== '\n') Scanl.toList
>>>
Stream.toList $ Stream.scanl line input
["","h","he","hel","hell","hello","hello\n"]
Parallel Distribution
teeWith :: Monad m => (b -> c -> d) -> Scanl m a b -> Scanl m a c -> Scanl m a d Source #
teeWith k f1 f2
distributes its input to both f1
and f2
until any
one of them terminates. The outputs of the two scans are combined using the
function k
.
Definition:
>>>
teeWith k f1 f2 = fmap (uncurry k) (Scanl.tee f1 f2)
Example:
>>>
avg = Scanl.teeWith (/) Scanl.sum (fmap fromIntegral Scanl.length)
>>>
Stream.toList $ Stream.postscanl avg $ Stream.fromList [1.0..10.0]
[1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
Note that nested applications of teeWith do not fuse.
Pre-release
Transforming inner monad
morphInner :: (forall x. m x -> n x) -> Scanl m a b -> Scanl n a b Source #
Change the underlying monad of a scan. Also known as hoist.
Pre-release
generalizeInner :: Monad m => Scanl Identity a b -> Scanl m a b Source #
Adapt a pure scan to any monad.
>>>
generalizeInner = Scanl.morphInner (return . runIdentity)
Pre-release
Types
Running Incremental Scans
Scans of type Scanl m (Incr a) b
are incremental sliding-window
scans. Names of such scans are prefixed with incr
. An input of type
(Insert a)
indicates that the input element a
is being inserted in
the window without ejecting an old value, increasing the window size by
1. An input of type (Replace a a)
indicates that the first argument of
Replace is being removed from the window and the second argument is being
inserted in the window, the window size remains the same. The window
size can only increase and never decrease.
You can compute the statistics over the entire stream using window folds
by always supplying input of type Insert a
.
The incremental scans are converted into scans over a window using the
incrScan
operation which maintains a sliding window and supplies the
new and/or exiting element of the window to the window scan in the form
of an incremental operation. The names of window scans are prefixed with
window
.
cumulativeScan :: Scanl m (Incr a) b -> Scanl m a b Source #
Convert an incremental scan to a cumulative scan using the entire input stream as a single window.
>>>
cumulativeScan = Scanl.lmap Scanl.Insert
incrScan :: forall m a b. (MonadIO m, Unbox a) => Int -> Scanl m (Incr a) b -> Scanl m a b Source #
incrScan collector
is an incremental sliding window scan that does not
require all the intermediate elements in each step of the scan computation.
This maintains n
elements in the window, when a new element comes it
slides out the oldest element. The new element along with the old element
are supplied to the collector fold.
incrScanWith :: forall m a b. (MonadIO m, Unbox a) => Int -> Scanl m (Incr a, RingArray a) b -> Scanl m a b Source #
Like incrScan
but also provides the ring array to the scan. The ring
array reflects the state of the ring after inserting the incoming element.
IMPORTANT NOTE: The ring is mutable, therefore, references to it should not be stored and used later, the state would have changed by then. If you need to store it then copy it to an array or another ring and store it.
Incremental Scans
incrRollingMap :: Monad m => (Maybe a -> a -> Maybe b) -> Scanl m (Incr a) (Maybe b) Source #
Apply a pure function on the latest and the oldest element of the window.
>>>
incrRollingMap f = Scanl.incrRollingMapM (\x y -> return $ f x y)
incrRollingMapM :: Monad m => (Maybe a -> a -> m (Maybe b)) -> Scanl m (Incr a) (Maybe b) Source #
Apply an effectful function on the entering and the exiting element of the window. The first argument of the mapped function is the exiting element and the second argument is the entering element.
Sums
incrCount :: (Monad m, Num b) => Scanl m (Incr a) b Source #
The number of elements in the rolling window.
This is the \(0\)th power sum.
>>>
incrCount = Scanl.incrPowerSum 0
incrSum :: forall m a. (Monad m, Num a) => Scanl m (Incr a) a Source #
Sum of all the elements in a rolling window:
\(S = \sum_{i=1}^n x_{i}\)
This is the first power sum.
>>>
incrSum = Scanl.incrPowerSum 1
Uses Kahan-Babuska-Neumaier style summation for numerical stability of floating precision arithmetic.
Space: \(\mathcal{O}(1)\)
Time: \(\mathcal{O}(n)\)
incrPowerSum :: (Monad m, Num a) => Int -> Scanl m (Incr a) a Source #
Sum of the \(k\)th power of all the elements in a rolling window:
\(S_k = \sum_{i=1}^n x_{i}^k\)
>>>
incrPowerSum k = Scanl.lmap (fmap (^ k)) Scanl.incrSum
Space: \(\mathcal{O}(1)\)
Time: \(\mathcal{O}(n)\)
incrPowerSumFrac :: (Monad m, Floating a) => a -> Scanl m (Incr a) a Source #
Like incrPowerSum
but powers can be negative or fractional. This is
slower than incrPowerSum
for positive intergal powers.
>>>
incrPowerSumFrac p = Scanl.lmap (fmap (** p)) Scanl.incrSum
Location
windowRange :: forall m a. (MonadIO m, Unbox a, Ord a) => Int -> Scanl m a (Maybe (a, a)) Source #
Determine the maximum and minimum in a rolling window.
This implementation traverses the entire window buffer to compute the
range whenever we demand it. It performs better than the dequeue based
implementation in streamly-statistics
package when the window size is
small (< 30).
If you want to compute the range of the entire stream
range
would be much faster.
Space: \(\mathcal{O}(n)\) where n
is the window size.
Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.
windowMinimum :: (MonadIO m, Unbox a, Ord a) => Int -> Scanl m a (Maybe a) Source #
Find the minimum element in a rolling window.
See the performance related comments in windowRange
.
If you want to compute the minimum of the entire stream
minimum
is much faster.
Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.
windowMaximum :: (MonadIO m, Unbox a, Ord a) => Int -> Scanl m a (Maybe a) Source #
The maximum element in a rolling window.
See the performance related comments in windowRange
.
If you want to compute the maximum of the entire stream
maximum
would be much faster.
Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.
incrMean :: forall m a. (Monad m, Fractional a) => Scanl m (Incr a) a Source #
Arithmetic mean of elements in a sliding window:
\(\mu = \frac{\sum_{i=1}^n x_{i}}{n}\)
This is also known as the Simple Moving Average (SMA) when used in the sliding window and Cumulative Moving Avergae (CMA) when used on the entire stream.
>>>
incrMean = Scanl.teeWith (/) Scanl.incrSum Scanl.incrCount
Space: \(\mathcal{O}(1)\)
Time: \(\mathcal{O}(n)\)
Scans
Accumulators
Semigroups and Monoids
sconcat :: (Monad m, Semigroup a) => a -> Scanl m a a Source #
Semigroup concat. Append the elements of an input stream to a provided starting value.
Definition:
>>>
sconcat = Scanl.mkScanl (<>)
>>>
semigroups = fmap Data.Monoid.Sum $ Stream.enumerateFromTo 1 3
>>>
Stream.toList $ Stream.scanl (Scanl.sconcat 3) semigroups
[Sum {getSum = 3},Sum {getSum = 4},Sum {getSum = 6},Sum {getSum = 9}]
mconcat :: (Monad m, Monoid a) => Scanl m a a Source #
Monoid concat. Fold an input stream consisting of monoidal elements using
mappend
and mempty
.
Definition:
>>>
mconcat = Scanl.sconcat mempty
>>>
monoids = fmap Data.Monoid.Sum $ Stream.enumerateFromTo 1 3
>>>
Stream.toList $ Stream.scanl Scanl.mconcat monoids
[Sum {getSum = 0},Sum {getSum = 1},Sum {getSum = 3},Sum {getSum = 6}]
foldMap :: (Monad m, Monoid b) => (a -> b) -> Scanl m a b Source #
Definition:
>>>
foldMap f = Scanl.lmap f Scanl.mconcat
Make a fold from a pure function that folds the output of the function
using mappend
and mempty
.
>>>
sum = Scanl.foldMap Data.Monoid.Sum
>>>
Stream.toList $ Stream.scanl sum $ Stream.enumerateFromTo 1 3
[Sum {getSum = 0},Sum {getSum = 1},Sum {getSum = 3},Sum {getSum = 6}]
foldMapM :: (Monad m, Monoid b) => (a -> m b) -> Scanl m a b Source #
Definition:
>>>
foldMapM f = Scanl.lmapM f Scanl.mconcat
Make a fold from a monadic function that folds the output of the function
using mappend
and mempty
.
>>>
sum = Scanl.foldMapM (return . Data.Monoid.Sum)
>>>
Stream.toList $ Stream.scanl sum $ Stream.enumerateFromTo 1 3
[Sum {getSum = 0},Sum {getSum = 1},Sum {getSum = 3},Sum {getSum = 6}]
Reducers
drainMapM :: Monad m => (a -> m b) -> Scanl m a () Source #
Definitions:
>>>
drainMapM f = Scanl.lmapM f Scanl.drain
>>>
drainMapM f = Scanl.foldMapM (void . f)
Drain all input after passing it through a monadic function. This is the dual of mapM_ on stream producers.
mean :: (Monad m, Fractional a) => Scanl m a a Source #
Compute a numerically stable arithmetic mean of all elements in the input stream.
rollingHash :: (Monad m, Enum a) => Scanl m a Int64 Source #
Compute an Int
sized polynomial rolling hash of a stream.
>>>
rollingHash = Scanl.rollingHashWithSalt Scanl.defaultSalt
defaultSalt :: Int64 Source #
A default salt used in the implementation of rollingHash
.
rollingHashWithSalt :: (Monad m, Enum a) => Int64 -> Scanl m a Int64 Source #
Compute an Int
sized polynomial rolling hash
H = salt * k ^ n + c1 * k ^ (n - 1) + c2 * k ^ (n - 2) + ... + cn * k ^ 0
Where c1
, c2
, cn
are the elements in the input stream and k
is a
constant.
This hash is often used in Rabin-Karp string search algorithm.
rollingHashFirstN :: (Monad m, Enum a) => Int -> Scanl m a Int64 Source #
Compute an Int
sized polynomial rolling hash of the first n elements of
a stream.
>>>
rollingHashFirstN n = Scanl.take n Scanl.rollingHash
Pre-release
Saturating Reducers
product
terminates if it becomes 0. Other folds can theoretically
saturate on bounded types, and therefore terminate, however, they will
run forever on unbounded types like Integer/Double.
sum :: (Monad m, Num a) => Scanl m a a Source #
Determine the sum of all elements of a stream of numbers. Returns additive
identity (0
) when the stream is empty. Note that this is not numerically
stable for floating point numbers.
>>>
sum = Scanl.cumulativeScan Scanl.incrSum
Same as following but numerically stable:
>>>
sum = Scanl.mkScanl (+) 0
>>>
sum = fmap Data.Monoid.getSum $ Scanl.foldMap Data.Monoid.Sum
product :: (Monad m, Num a, Eq a) => Scanl m a a Source #
Determine the product of all elements of a stream of numbers. Returns
multiplicative identity (1
) when the stream is empty. The fold terminates
when it encounters (0
) in its input.
Same as the following but terminates on multiplication by 0
:
>>>
product = fmap Data.Monoid.getProduct $ Scanl.foldMap Data.Monoid.Product
Collectors
Avoid using these folds in scalable or performance critical applications, they buffer all the input in GC memory which can be detrimental to performance if the input is large.
toListRev :: Monad m => Scanl m a [a] Source #
Buffers the input stream to a list in the reverse order of the input.
Definition:
>>>
toListRev = Scanl.mkScanl (flip (:)) []
Warning! working on large lists accumulated as buffers in memory could be very inefficient, consider using Streamly.Array instead.
This is more efficient than toList
. toList is
exactly the same as reversing the list after toListRev
.
toStream :: (Monad m, Monad n) => Scanl m a (Stream n a) Source #
A fold that buffers its input to a pure stream.
Warning! working on large streams accumulated as buffers in memory could be very inefficient, consider using Streamly.Data.Array instead.
>>>
toStream = fmap Stream.fromList Scanl.toList
Pre-release
toStreamRev :: (Monad m, Monad n) => Scanl m a (Stream n a) Source #
Buffers the input stream to a pure stream in the reverse order of the input.
>>>
toStreamRev = fmap Stream.fromList Scanl.toListRev
Warning! working on large streams accumulated as buffers in memory could be very inefficient, consider using Streamly.Data.Array instead.
Pre-release
topBy :: (MonadIO m, Unbox a) => (a -> a -> Ordering) -> Int -> Scanl m a (MutArray a) Source #
Get the top n
elements using the supplied comparison function.
To get bottom n elements instead:
>>>
bottomBy cmp = Scanl.topBy (flip cmp)
Example:
>>>
stream = Stream.fromList [2::Int,7,9,3,1,5,6,11,17]
>>>
Stream.toList (Stream.scanl (Scanl.topBy compare 3) stream) >>= mapM MutArray.toList
[[],[17],[17,11],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9]]
Pre-release
top :: (MonadIO m, Unbox a, Ord a) => Int -> Scanl m a (MutArray a) Source #
Fold the input stream to top n elements.
Definition:
>>>
top = Scanl.topBy compare
>>>
stream = Stream.fromList [2::Int,7,9,3,1,5,6,11,17]
>>>
Stream.toList (Stream.scanl (Scanl.top 3) stream) >>= mapM MutArray.toList
[[],[17],[17,11],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9],[17,11,9]]
Pre-release
bottomBy :: (MonadIO m, Unbox a) => (a -> a -> Ordering) -> Int -> Scanl m a (MutArray a) Source #
Get the bottom most n
elements using the supplied comparison function.
bottom :: (MonadIO m, Unbox a, Ord a) => Int -> Scanl m a (MutArray a) Source #
Fold the input stream to bottom n elements.
Definition:
>>>
bottom = Scanl.bottomBy compare
>>>
stream = Stream.fromList [2::Int,7,9,3,1,5,6,11,17]
>>>
Stream.toList (Stream.scanl (Scanl.bottom 3) stream) >>= mapM MutArray.toList
[[],[1],[1,2],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3]]
Pre-release
Scanners
Stateful transformation of the elements. Useful in combination with
the postscanlMaybe
combinator. For scanners the result of the fold is
usually a transformation of the current element rather than an
aggregation of all elements till now.
, nthLast -- using RingArray array
indexingWith :: Monad m => Int -> (Int -> Int) -> Scanl m a (Maybe (Int, a)) Source #
Pair each element of a scan input with its index, starting from index 0.
indexingRev :: Monad m => Int -> Scanl m a (Maybe (Int, a)) Source #
>>>
indexingRev n = Scanl.indexingWith n (subtract 1)
rollingMap :: Monad m => (Maybe a -> a -> b) -> Scanl m a b Source #
>>>
rollingMap f = Scanl.rollingMapM (\x y -> return $ f x y)
rollingMapM :: Monad m => (Maybe a -> a -> m b) -> Scanl m a b Source #
Apply a function on every two successive elements of a stream. The first
argument of the map function is the previous element and the second argument
is the current element. When processing the very first element in the
stream, the previous element is Nothing
.
Pre-release
Filters
Useful in combination with the postscanlMaybe
combinator.
deleteBy :: Monad m => (a -> a -> Bool) -> a -> Scanl m a (Maybe a) Source #
Returns the latest element omitting the first occurrence that satisfies the given equality predicate.
Example:
>>>
input = Stream.fromList [1,3,3,5]
>>>
Stream.toList $ Stream.postscanlMaybe (Scanl.deleteBy (==) 3) input
[1,3,5]
uniqBy :: Monad m => (a -> a -> Bool) -> Scanl m a (Maybe a) Source #
Return the latest unique element using the supplied comparison function.
Returns Nothing
if the current element is same as the last element
otherwise returns Just
.
Example, strip duplicate path separators:
>>>
input = Stream.fromList "//a//b"
>>>
f x y = x == '/' && y == '/'
>>>
Stream.toList $ Stream.postscanlMaybe (Scanl.uniqBy f) input
"/a/b"
Space: O(1)
Pre-release
findIndices :: Monad m => (a -> Bool) -> Scanl m a (Maybe Int) Source #
Returns the index of the latest element if the element satisfies the given predicate.
elemIndices :: (Monad m, Eq a) => a -> Scanl m a (Maybe Int) Source #
Returns the index of the latest element if the element matches the given value.
Definition:
>>>
elemIndices a = Scanl.findIndices (== a)
Multi folds
Terminate after consuming one or more elements.
drainN :: Monad m => Int -> Scanl m a () Source #
A fold that drains the first n elements of its input, running the effects and discarding the results.
Definition:
>>>
drainN n = Scanl.take n Scanl.drain
Pre-release
Trimmers
Useful in combination with the postscanlMaybe
combinator.
takingEndBy :: Monad m => (a -> Bool) -> Scanl m a (Maybe a) Source #
>>>
takingEndBy p = Scanl.takingEndByM (return . p)
takingEndBy_ :: Monad m => (a -> Bool) -> Scanl m a (Maybe a) Source #
>>>
takingEndBy_ p = Scanl.takingEndByM_ (return . p)
droppingWhile :: Monad m => (a -> Bool) -> Scanl m a (Maybe a) Source #
>>>
droppingWhile p = Scanl.droppingWhileM (return . p)
prune :: (a -> Bool) -> Scanl m a (Maybe a) Source #
Strip all leading and trailing occurrences of an element passing a predicate and make all other consecutive occurrences uniq.
> prune p = Stream.dropWhileAround p $ Stream.uniqBy (x y -> p x && p y)
> Stream.prune isSpace (Stream.fromList " hello world! ") "hello world!"
Space: O(1)
Unimplemented
Combinators
Utilities
with :: (Scanl m (s, a) b -> Scanl m a b) -> (((s, a) -> c) -> Scanl m (s, a) b -> Scanl m (s, a) b) -> ((s, a) -> c) -> Scanl m a b -> Scanl m a b Source #
Change the predicate function of a Fold from a -> b
to accept an
additional state input (s, a) -> b
. Convenient to filter with an
addiitonal index or time input.
>>>
filterWithIndex = Scanl.with Scanl.indexed Scanl.filter
filterWithAbsTime = with timestamped filter filterWithRelTime = with timeIndexed filter
Pre-release
Scanning Input
pipe :: Monad m => Pipe m a b -> Scanl m b c -> Scanl m a c Source #
Attach a Pipe
on the input of a Fold
.
Pre-release
indexed :: Monad m => Scanl m (Int, a) b -> Scanl m a b Source #
Pair each element of a fold input with its index, starting from index 0.
>>>
indexed = Scanl.postscanlMaybe Scanl.indexing
Zipping Input
zipStreamWithM :: (a -> b -> m c) -> Stream m a -> Scanl m c x -> Scanl m b x Source #
Zip a stream with the input of a fold using the supplied function.
Unimplemented
zipStream :: Monad m => Stream m a -> Scanl m (a, b) x -> Scanl m b x Source #
Zip a stream with the input of a fold.
>>>
zip = Scanl.zipStreamWithM (curry return)
Unimplemented
Filtering Input
mapMaybeM :: Monad m => (a -> m (Maybe b)) -> Scanl m b r -> Scanl m a r Source #
>>>
mapMaybeM f = Scanl.lmapM f . Scanl.catMaybes
mapMaybe :: Monad m => (a -> Maybe b) -> Scanl m b r -> Scanl m a r Source #
mapMaybe f fold
maps a Maybe
returning function f
on the input of
the fold, filters out Nothing
elements, and return the values extracted
from Just
.
>>>
mapMaybe f = Scanl.lmap f . Scanl.catMaybes
>>>
mapMaybe f = Scanl.mapMaybeM (return . f)
>>>
f x = if even x then Just x else Nothing
>>>
scn = Scanl.mapMaybe f Scanl.toList
>>>
Stream.toList $ Stream.scanl scn (Stream.enumerateFromTo 1 10)
[[],[],[2],[2],[2,4],[2,4],[2,4,6],[2,4,6],[2,4,6,8],[2,4,6,8],[2,4,6,8,10]]
sampleFromthen :: Monad m => Int -> Int -> Scanl m a b -> Scanl m a b Source #
sampleFromthen offset stride
samples the element at offset
index and
then every element at strides of stride
.
Parallel Distribution
tee :: Monad m => Scanl m a b -> Scanl m a c -> Scanl m a (b, c) Source #
Distribute one copy of the stream to each scan and zip the results.
|-------Scanl m a b--------| ---stream m a---| |---m (b,c) |-------Scanl m a c--------|
Definition:
>>>
tee = Scanl.teeWith (,)
Example:
>>>
t = Scanl.tee Scanl.sum Scanl.length
>>>
Stream.toList $ Stream.scanl t (Stream.enumerateFromTo 1.0 10.0)
[(0.0,0),(1.0,1),(3.0,2),(6.0,3),(10.0,4),(15.0,5),(21.0,6),(28.0,7),(36.0,8),(45.0,9),(55.0,10)]
distribute :: Monad m => [Scanl m a b] -> Scanl m a [b] Source #
Distribute one copy of the stream to each fold and collect the results in a container.
|-------Scanl m a b--------| ---stream m a---| |---m [b] |-------Scanl m a b--------| | | ...
>>>
Stream.toList $ Stream.scanl (Scanl.distribute [Scanl.sum, Scanl.length]) (Stream.enumerateFromTo 1 5)
[[0,0],[1,1],[3,2],[6,3],[10,4],[15,5]]
>>>
distribute = Prelude.foldr (Scanl.teeWith (:)) (Scanl.const [])
This is the consumer side dual of the producer side sequence
operation.
Stops as soon as any of the scans stop.
Unzipping
unzip :: Monad m => Scanl m a x -> Scanl m b y -> Scanl m (a, b) (x, y) Source #
Send the elements of tuples in a stream of tuples through two different folds.
|-------Scanl m a x--------| ---------stream of (a,b)--| |----m (x,y) |-------Scanl m b y--------|
Definition:
>>>
unzip = Scanl.unzipWith id
This is the consumer side dual of the producer side zip
operation.
unzipWith :: Monad m => (a -> (b, c)) -> Scanl m b x -> Scanl m c y -> Scanl m a (x, y) Source #
Split elements in the input stream into two parts using a pure splitter function, direct each part to a different fold and zip the results.
Definitions:
>>>
unzipWith f = Scanl.unzipWithM (return . f)
>>>
unzipWith f fld1 fld2 = Scanl.lmap f (Scanl.unzip fld1 fld2)
This scan terminates as soon as any of the input scans terminate.
Pre-release
unzipWithM :: Monad m => (a -> m (b, c)) -> Scanl m b x -> Scanl m c y -> Scanl m a (x, y) Source #
Like unzipWith
but with a monadic splitter function.
Definition:
>>>
unzipWithM k f1 f2 = Scanl.lmapM k (Scanl.unzip f1 f2)
Pre-release
Partitioning
partitionByM :: Monad m => (a -> m (Either b c)) -> Scanl m b x -> Scanl m c x -> Scanl m a x Source #
Partition the input over two scans using an Either
partitioning
predicate.
|-------Scanl b x--------| -----stream m a --> (Either b c)----| |----(x,y) |-------Scanl c y--------|
Example, send input to either fold randomly:
>>>
:set -package random
>>>
import System.Random (randomIO)
>>>
randomly a = randomIO >>= \x -> return $ if x then Left a else Right a
>>>
f = Scanl.partitionByM randomly Scanl.length Scanl.length
>>>
Stream.toList $ Stream.scanl f (Stream.enumerateFromTo 1 10)
...
Example, send input to the two folds in a proportion of 2:1:
>>>
:set -fno-warn-unrecognised-warning-flags
>>>
:set -fno-warn-x-partial
>>>
:{
proportionately m n = do ref <- newIORef $ cycle $ concat [replicate m Left, replicate n Right] return $ \a -> do r <- readIORef ref writeIORef ref $ tail r return $ Prelude.head r a :}
>>>
:{
main = do g <- proportionately 2 1 let f = Scanl.partitionByM g Scanl.length Scanl.length r <- Stream.toList $ Stream.scanl f (Stream.enumerateFromTo (1 :: Int) 10) print r :}
>>>
main
...
This is the consumer side dual of the producer side mergeBy
operation.
Terminates as soon as any of the scans terminate.
Pre-release
partitionBy :: Monad m => (a -> Either b c) -> Scanl m b x -> Scanl m c x -> Scanl m a x Source #
Same as partitionByM
but with a pure partition function.
Example, count even and odd numbers in a stream:
>>>
:{
let f = Scanl.partitionBy (\n -> if even n then Left n else Right n) (fmap (("Even " ++) . show) Scanl.length) (fmap (("Odd " ++) . show) Scanl.length) in Stream.toList $ Stream.postscanl f (Stream.enumerateFromTo 1 10) :} ["Odd 1","Even 1","Odd 2","Even 2","Odd 3","Even 3","Odd 4","Even 4","Odd 5","Even 5"]
Pre-release
Nesting
unfoldMany :: Monad m => Unfold m a b -> Scanl m b c -> Scanl m a c Source #
Unfold and flatten the input stream of a fold.
Stream.scanl (unfoldMany u f) == Stream.scanl f . Stream.unfoldMany u
Pre-release
Set operations
toSet :: (Monad m, Ord a) => Scanl m a (Set a) Source #
Scan the input adding it to a set.
Definition:
>>>
toSet = Scanl.mkScanl (flip Set.insert) Set.empty
toIntSet :: Monad m => Scanl m Int IntSet Source #
Scan the input adding it to an int set. For integer inputs this performs
better than toSet
.
Definition:
>>>
toIntSet = Scanl.mkScanl (flip IntSet.insert) IntSet.empty
countDistinct :: (Monad m, Ord a) => Scanl m a Int Source #
Count non-duplicate elements in the stream.
Definition:
>>>
countDistinct = fmap Set.size Scanl.toSet
>>>
countDistinct = Scanl.postscanl Scanl.nub $ Scanl.catMaybes $ Scanl.length
The memory used is proportional to the number of distinct elements in the stream, to guard against using too much memory use it as a scan and terminate if the count reaches more than a threshold.
Space: \(\mathcal{O}(n)\)
Pre-release
countDistinctInt :: Monad m => Scanl m Int Int Source #
Like countDistinct
but specialized to a stream of Int
, for better
performance.
Definition:
>>>
countDistinctInt = fmap IntSet.size Scanl.toIntSet
>>>
countDistinctInt = Scanl.postscanl Scanl.nubInt $ Scanl.catMaybes $ Scanl.length
Pre-release
Map operations
Demultiplexing
Direct values in the input stream to different folds using an n-ary
fold selector. demux
is a generalization of classify
(and
partition
) where each key of the classifier can use a different fold.
You need to see only demux
if you are looking to find the capabilities
of these combinators, all others are variants of that.
demuxGeneric :: (Monad m, IsMap f, Traversable f) => (a -> Key f) -> (Key f -> m (Maybe (Scanl m a b))) -> Scanl m a (m (f b), Maybe (Key f, b)) Source #
This is the most general of all demux, classify operations.
The first component of the output tuple is a key-value Map of in-progress scans. The scan returns the scan result as the second component of the output tuple.
See demux
for documentation.
demux :: (Monad m, Ord k) => (a -> k) -> (k -> m (Maybe (Scanl m a b))) -> Scanl m a (Maybe (k, b)) Source #
demux getKey getScan
: In a key value stream, scan values corresponding
to each key using a key specific scan. getScan
is invoked to generate a
key specific scan when a key is encountered for the first time in the
stream. If a scan does not exist corresponding to the key then Nothing
is
returned otherwise the result of the scan is returned.
If a scan terminates, another instance of the scan is started upon receiving
an input with that key, getScan
is invoked again whenever the key is
encountered again.
This can be used to scan a stream, splitting it based on different keys.
Since the scan generator function is monadic we can add scans dynamically. For example, we can maintain a Map of keys to scans in an IORef and lookup the scan from that corresponding to a key. This Map can be changed dynamically, scans for new keys can be added or scans for old keys can be deleted or modified.
Compare with classify
, the scan in classify
is a static scan.
Pre-release
demuxGenericIO :: (MonadIO m, IsMap f, Traversable f) => (a -> Key f) -> (Key f -> m (Maybe (Scanl m a b))) -> Scanl m a (m (f b), Maybe (Key f, b)) Source #
This is specialized version of demuxGeneric
that uses mutable IO cells
as fold accumulators for better performance.
Keep in mind that the values in the returned Map may be changed by the ongoing scan if you are using those concurrently in another thread.
demuxIO :: (MonadIO m, Ord k) => (a -> k) -> (k -> m (Maybe (Scanl m a b))) -> Scanl m a (Maybe (k, b)) Source #
This is specialized version of demux
that uses mutable IO cells as scan
accumulators for better performance.
Classifying
In an input stream of key value pairs fold values for different keys
in individual output buckets using the given fold. classify
is a
special case of demux
where all the branches of the demultiplexer use
the same fold.
Different types of maps can be used with these combinators via the IsMap type class. Hashmap performs better when there are more collisions, trie Map performs better otherwise. Trie has an advantage of sorting the keys at the same time. For example if we want to store a dictionary of words and their meanings then trie Map would be better if we also want to display them in sorted order.
classifyGeneric :: (Monad m, IsMap f, Traversable f, Ord (Key f)) => (a -> Key f) -> Scanl m a b -> Scanl m a (m (f b), Maybe (Key f, b)) Source #
classify :: (MonadIO m, Ord k) => (a -> k) -> Scanl m a b -> Scanl m a (Maybe (k, b)) Source #
Scans the values for each key using the supplied scan.
Once the scan for a key terminates, any future values of the key are ignored.
Equivalent to the following except that the scan is not restarted:
>>>
classify f fld = Scanl.demux f (const fld)
classifyGenericIO :: (MonadIO m, IsMap f, Traversable f, Ord (Key f)) => (a -> Key f) -> Scanl m a b -> Scanl m a (m (f b), Maybe (Key f, b)) Source #
Be aware that the values in the intermediate Maps would be mutable.
classifyIO :: (MonadIO m, Ord k) => (a -> k) -> Scanl m a b -> Scanl m a (Maybe (k, b)) Source #
Same as classify except that it uses mutable IORef cells in the Map, providing better performance.
Equivalent to the following except that the scan is not restarted:
>>>
classifyIO f fld = Scanl.demuxIO f (const fld)