The goal of Streamly is to provide unified abstractions with high performance. The basic abstractions in Streamly are streams and arrays and yet it provides the functionality of many packages in the Haskell ecosystem.
This document discusses the related packages in the Haskell ecosystem and how streamly unifies, overlaps, or compares with those. We provide simple code snippets for illustrations, for a better overview of the library please see the Streamly Quick Overview document.
In the Haskell ecosystem effectful streaming functionality is provided by several streaming libraries e.g.
streaming, nested looping by list transformers e.g.
list-t, interleaved scheduling by
logict, concurrency by
async, time-domain programming by FRP libraries like
reflex, and array functionality by
For basic programming needs, one needs to discover and learn all of these or roll something on their own. These libraries have evolved independently for specialized needs. They are not designed with a big picture in mind to eliminate duplicate functionality and to come up with the most optimal abstractions on the whole. For example, there is no reason why streaming, list-t, and logict cannot be provided by the same implementation. Concurrency and reactive programming also naturally fit with the streaming model and can be unified.
arrays all provide the same functionality which can be unified under one umbrella of arrays. Moreover, there are many of flavors some of these packages like lazy bytestring, Char8 bytestrings, lazy text which can be eliminated by using streams instead.
On the performance side, existing streaming libraries lack stream fusion leading to a function call overhead in each iteration of the loop, therefore, providing orders of magnitude lower performance in tight loops compared to writing a monolithic loop by hand. Similarly,
logict shows quadratic performance characteristics in some basic use cases. These problems are solved by Streamly.
Streamly unifies streaming, nested looping, scheduling, concurrency, time, and array functionality using two basic constructs, namely streams and arrays. Streams provide efficient, immutable, composable serial processing capabilities for in-flight data, whereas arrays provide efficient storage, mutability, and random access for data at rest.
Streamly builds all the functionality on top of these two building blocks. It takes a big picture view of programming in general and provides well-integrated APIs with the same look and feel. Moreover, performance is a primary goal of streamly, all the functionality is built for performance comparable to C.
Streamly streams are like lists and provide non-determinism just like lists. Streamly also adds asynchronicity and concurrency to streaming composition. This seemingly simple change unifies several disparate abstractions into one powerful, concise, and elegant abstraction. A wide variety of programming problems can be solved elegantly with this abstraction. In particular, it unifies three major programming domains namely non-deterministic (logic) programming, concurrent programming, and functional Reactive programming.
The basic, bare-bones functionality of Streamly is processing streams of data. In simple terms, stream processing is the functional equivalent of loops in imperative programming.
Like Haskell lists,
streaming packages, Streamly composes streams of data rather than stream processor functions as in other streaming libraries like
machines. This makes the types and API very simple and is very similar to Haskell lists which is familiar to everyone. The fundamental difference is that Streamly adds concurrency support but the good thing is that it does not change the API.
This simple console echo program shows the simplicity of Streamly API and its similarity with the list API:
echo = Stream.repeatM getLine & Stream.mapM putStrLn & Stream.drain
Streamly uses dual representation for streams. On top it uses Scott encoded CPS streams which provide a way to incrementally construct and append streams efficiently. Under the good, for tight loops, Streamly uses a vector-like stream representation which is amenable to
SPEC constructor optimizations by GHC resulting in low-level code having performance comparable to C.
Roughly speaking, non-determinism is a fancy term that functional programmers use for what you call nested loops in imperative programming. List transformers are the basic implementations for non-determinism.
The stream monad in Streamly is a list-transformer with behavior similar to the list monad. It provides the functionality provided by
logict packages for free. Here is an example of nested looping using the serial stream monad:
```language- haskell import qualified Streamly.Prelude as S
loops = do x <- Stream.fromFoldable [1,2] y <- Stream.fromFoldable [3,4] Stream.fromEffect $ print (x, y)
Moreover, the list transformer in Streamly can be concurrent. The Scott encoding of streams also avoids the quadratic performance issue of `logict`. ## Scheduling Behaviors When we execute a stream or combine two streams, the actions in the stream need to be scheduled for execution. Existing streaming libraries are limited to just one form of scheduling of actions in the stream which is serial execution. Streamly provides several ways of scheduling the actions in the stream. The good thing is the API does not change, we just need to use a combinator or a different type to change the scheduling behavior. In a single stream, the actions in the stream can be executed concurrently with different concurrent scheduling behaviors. Two or more streams can be combined in several ways. For example, the actions from two streams being combined can be interleaved. Similarly, streams can be interleaved with concurrent execution providing a fair, concurrent round-robin scheduling of streams. The monad instance provides a convenient way to combine streams in different ways. It is just non-determinism with different flavors of scheduling behavior. The example from the previous section can be run with interleaved scheduling behavior as follows, without changing the code at all: ```language-haskell main = Stream.drain $ Stream.fromWserial loops
Scheduling is fundamental to expressing many common programming problems in an idiomatic functional manner. One example application of interleaving is breadth-first search mechanism for logic programming. Please see mini kanren implementation using streamly.
The same combinators that are used for serial streams e.g. ‘unfoldrM’, ‘replicateM’, ‘repeatM’ work concurrently when used at the appropriate type. It allows concurrent programs to be written declaratively and composed idiomatically. They are not much different than serial programs. See Streamly vs async for a comparison of Streamly with the
Streamly provides concurrent scheduling and looping similar to to OpenMP and Cilk but with a more declarative style. The list transformer example can be run with concurrent execution of loop iterations as follows, without changing the code at all:
main = Stream.drain $ Stream.fromAhead loops
And interleaving with concurrent execution of the loop iterations can be written like this:
main = Stream.drain $ Stream.fromWAsync loops
All this comes with no change in the streaming APIs.
The combination of non-determinism, concurrency, and streaming makes Streamly a strong reactive programming library as well. Reactive programming fundamentally deals with streams of events that can be processed concurrently. The https://github.com/composewell/streamly-examples/tree/master/AcidRain.hs and Circling Square examples demonstrate the basic reactive capability of Streamly.
In core concepts, Streamly is strikingly similar to
dunai was designed from a FRP perspective and Streamly was originally designed from a concurrency perspective. However, both have similarities at the core.
Streamly provides immutable, mutable, pinned, unpinned, boxed, and unboxed arrays with streaming interfaces. The combination of efficient streaming and polymorphic arrays lets it express the functionality of
text packages as special cases of arrays with no loss of performance. Since we can use explicit streaming, the lazy versions of
text are not required.
Streamly, provides effectful streams, with a simple API, almost identical to standard lists, and in-built support for concurrency. By using stream-style combinators on stream composition, streams can be generated, merged, chained, mapped, zipped, and consumed concurrently – providing a generalized high-level programming framework unifying streaming and concurrency. Controlled concurrency allows even infinite streams to be evaluated concurrently. Concurrency is auto-scaled based on feedback from the stream consumer.
Streamly is a programmer-first library, designed to be useful and friendly to programmers for solving practical problems in a simple and concise manner. Some key points that Streamly stresses are:
vectorlibrary in most cases and even better in some cases. Concurrent performance is unbeatable. See streaming-benchmarks for a comparison of popular streaming libraries on micro-benchmarks.
Streamly unifies the functionality overlapping the following Haskell libraries: