Setup
>>>
:m
>>>
:set XFlexibleContexts
>>>
:set XMagicHash
>>>
import Data.Function ((&))
>>>
import Data.Functor.Identity (Identity(..))
>>>
import System.IO.Unsafe (unsafePerformIO)
>>>
import Streamly.Data.Array (Array)
>>>
import Streamly.Data.Stream (Stream)
>>>
import qualified Streamly.Data.Array as Array
>>>
import qualified Streamly.Data.Fold as Fold
>>>
import qualified Streamly.Data.ParserK as ParserK
>>>
import qualified Streamly.Data.Stream as Stream
>>>
import qualified Streamly.Data.StreamK as StreamK
For APIs that have not been released yet.
>>>
import qualified Streamly.Internal.Data.Array as Array
>>>
import qualified Streamly.Internal.Data.Stream as Stream
Design Notes
To summarize:
 Arrays are finite and fixed in size
 provide O(1) access to elements
 store only data and not functions
 provide efficient IO interfacing
Foldable
instance is not provided because the implementation would be much
less efficient compared to folding via streams. Semigroup
and Monoid
instances should be used with care; concatenating arrays using binary
operations can be highly inefficient. Instead, use
toArray
to concatenate N
arrays at once.
Each array is one pointer visible to the GC. Too many small arrays (e.g. single byte) are only as good as holding those elements in a Haskell list. However, small arrays can be compacted into large ones to reduce the overhead. To hold 32GB memory in 32k sized buffers we need 1 million arrays if we use one array for each chunk. This is still significant to add pressure to GC.
The Array Type
Type
We can use an Unbox
constraint in the Array type and the constraint can
be automatically provided to a function that pattern matches on the Array
type. However, it has huge performance cost, so we do not use it.
Investigate a GHC improvement possiblity.
Array  

Instances
a ~ Char => IsString (Array a) Source #  
Defined in Streamly.Internal.Data.Array.Type fromString :: String > Array a Source #  
Unbox a => Monoid (Array a) Source #  
Unbox a => Semigroup (Array a) Source #  This should not be used for combining many or N arrays as it would copy
the two arrays everytime to a new array. For coalescing multiple arrays use

Unbox a => IsList (Array a) Source #  
(Unbox a, Read a, Show a) => Read (Array a) Source #  
(Show a, Unbox a) => Show (Array a) Source #  
Eq (Array Int16) Source #  
Eq (Array Int32) Source #  
Eq (Array Int64) Source #  
Eq (Array Int8) Source #  
Eq (Array Word16) Source #  
Eq (Array Word32) Source #  
Eq (Array Word64) Source #  
Eq (Array Word8) Source #  
Eq (Array Char) Source #  
Eq (Array Int) Source #  
(Unbox a, Eq a) => Eq (Array a) Source #  If the type allows a bytebybyte comparison this instance can be
overlapped by a more specific instance that uses 
(Unbox a, Ord a) => Ord (Array a) Source #  
Defined in Streamly.Internal.Data.Array.Type  
Serialize (Array a) Source #  
Defined in Streamly.Internal.Data.Serialize.Type  
type Item (Array a) Source #  
Defined in Streamly.Internal.Data.Array.Type 
Conversion
Mutable and Immutable
unsafeFreeze :: MutArray a > Array a Source #
Makes an immutable array using the underlying memory of the mutable array.
Please make sure that there are no other references to the mutable array lying around, so that it is never used after freezing it using unsafeFreeze. If the underlying array is mutated, the immutable promise is lost.
Prerelease
unsafeFreezeWithShrink :: Unbox a => MutArray a > Array a Source #
Similar to unsafeFreeze
but uses rightSize
on the mutable array
first.
unsafeThaw :: Array a > MutArray a Source #
Makes a mutable array using the underlying memory of the immutable array.
Please make sure that there are no other references to the immutable array lying around, so that it is never used after thawing it using unsafeThaw. If the resulting array is mutated, any references to the older immutable array are mutated as well.
Prerelease
Pinned and Unpinned
pin :: Array a > IO (Array a) Source #
Return a copy of the Array
in pinned memory if unpinned, else return the
original array.
unpin :: Array a > IO (Array a) Source #
Return a copy of the Array
in unpinned memory if pinned, else return the
original array.
Casting
unsafePinnedAsPtr :: MonadIO m => Array a > (Ptr a > m b) > m b Source #
Use an Array a
as Ptr a
.
See unsafePinnedAsPtr
in the Mutable array module for more details.
Unsafe
Prerelease
Construction
Cloning
Slicing
Get a subarray without copying
splitAt :: Unbox a => Int > Array a > (Array a, Array a) Source #
Create two slices of an array without copying the original array. The
specified index i
is the first index of the second slice.
Stream Folds
unsafeMakePure :: Monad m => Fold IO a b > Fold m a b Source #
Fold "step" has a dependency on "initial", and each step is dependent on the previous invocation of step due to state passing, finally extract depends on the result of step, therefore, as long as the fold is driven in the correct order the operations would be correctly ordered. We need to ensure that we strictly evaluate the previous step completely before the next step.
To not share the same array we need to make sure that the result of "initial" is not shared. Existential type ensures that it does not get shared across different folds. However, if we invoke "initial" multiple times for the same fold, there is a possiblity of sharing the two because the compiler would consider it as a pure value. One such example is the chunksOf combinator, or using an array creation fold with foldMany combinator. Is there a proper way in GHC to tell it to not share a pure expression in a particular case?
For this reason array creation folds have a MonadIO constraint. Pure folds could be unsafe and dangerous. This is dangerous especially when used with foldMany like operations.
>>>
unsafePureWrite = Array.unsafeMakePure Array.write
createOf :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
createOf n
folds a maximum of n
elements from the input stream to an
Array
.
pinnedCreateOf :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Like createOf
but creates a pinned array.
unsafeCreateOf :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Like createOf
but does not check the array bounds when writing. The fold
driver must not call the step function more than n
times otherwise it will
corrupt the memory and crash. This function exists mainly because any
conditional in the step function blocks fusion causing 10x performance
slowdown.
create :: forall m a. (MonadIO m, Unbox a) => Fold m a (Array a) Source #
Fold the whole input to a single array.
Caution! Do not use this on infinite streams.
pinnedCreate :: forall m a. (MonadIO m, Unbox a) => Fold m a (Array a) Source #
Like create
but creates a pinned array.
From containers
fromListN :: Unbox a => Int > [a] > Array a Source #
Create an Array
from the first N elements of a list. The array is
allocated to size N, if the list terminates before N elements then the
array may hold less than N elements.
pinnedFromListN :: Unbox a => Int > [a] > Array a Source #
Like fromListN
but creates a pinned array.
fromList :: Unbox a => [a] > Array a Source #
Create an Array
from a list. The list must be of finite size.
fromListRevN :: Unbox a => Int > [a] > Array a Source #
Create an Array
from the first N elements of a list in reverse order.
The array is allocated to size N, if the list terminates before N elements
then the array may hold less than N elements.
Prerelease
fromListRev :: Unbox a => [a] > Array a Source #
Create an Array
from a list in reverse order. The list must be of finite
size.
Prerelease
fromStreamN :: (MonadIO m, Unbox a) => Int > Stream m a > m (Array a) Source #
Create an Array
from the first N elements of a stream. The array is
allocated to size N, if the stream terminates before N elements then the
array may hold less than N elements.
>>>
fromStreamN n = Stream.fold (Array.writeN n)
Prerelease
fromStream :: (MonadIO m, Unbox a) => Stream m a > m (Array a) Source #
Create an Array
from a stream. This is useful when we want to create a
single array from a stream of unknown size. writeN
is at least twice
as efficient when the size is already known.
>>>
fromStream = Stream.fold Array.write
Note that if the input stream is too large memory allocation for the array
may fail. When the stream size is not known, chunksOf
followed by
processing of indvidual arrays in the resulting stream should be preferred.
Prerelease
fromPureStream :: Unbox a => Stream Identity a > Array a Source #
Convert a pure stream in Identity monad to an immutable array.
Same as the following but with better performance:
>>>
fromPureStream = Array.fromList . runIdentity . Stream.toList
fromByteStr# :: Addr# > Array Word8 Source #
Copy a null terminated immutable Addr#
Word8 sequence into an array.
Unsafe: The caller is responsible for safe addressing.
Note that this is completely safe when reading from Haskell string literals because they are guaranteed to be NULL terminated:
>>>
Array.toList $ Array.fromByteStr# "\1\2\3\0"#
[1,2,3]
Note that this should be evaluated strictly to ensure that we do not hold the reference to the pointer in a lazy thunk.
fromByteStr :: Ptr Word8 > Array Word8 Source #
Note that this should be evaluated strictly to ensure that we do not hold the reference to the pointer in a lazy thunk.
fromPtrN :: Int > Ptr Word8 > Array Word8 Source #
Copy an immutable 'Ptr Word8' sequence into an array.
Unsafe: The caller is responsible for safe addressing.
Note that this should be evaluated strictly to ensure that we do not hold the reference to the pointer in a lazy thunk.
fromChunks :: (MonadIO m, Unbox a) => Stream m (Array a) > m (Array a) Source #
Given a stream of arrays, splice them all together to generate a single array. The stream must be finite.
fromChunksK :: (MonadIO m, Unbox a) => StreamK m (Array a) > m (Array a) Source #
Convert an array stream to an array. Note that this requires peak memory that is double the size of the array stream.
Reading
Indexing
unsafeIndexIO :: forall a. Unbox a => Int > Array a > IO a Source #
Return element at the specified index without checking the bounds.
Unsafe because it does not check the bounds of the array.
getIndexUnsafe :: forall a. Unbox a => Int > Array a > a Source #
Return element at the specified index without checking the bounds.
To Streams
read :: (Monad m, Unbox a) => Array a > Stream m a Source #
Convert an Array
into a stream.
Prerelease
readRev :: (Monad m, Unbox a) => Array a > Stream m a Source #
Convert an Array
into a stream in reverse order.
Prerelease
To Containers
Unfolds
readerUnsafe :: forall m a. (Monad m, Unbox a) => Unfold m (Array a) a Source #
Unfold an array into a stream, does not check the end of the array, the user is responsible for terminating the stream within the array bounds. For high performance application where the end condition can be determined by a terminating fold.
Written in the hope that it may be faster than "read", however, in the case for which this was written, "read" proves to be faster even though the core generated with unsafeRead looks simpler.
Prerelease
reader :: forall m a. (Monad m, Unbox a) => Unfold m (Array a) a Source #
Unfold an array into a stream.
readerRev :: forall m a. (Monad m, Unbox a) => Unfold m (Array a) a Source #
Unfold an array into a stream in reverse order.
Size
length :: Unbox a => Array a > Int Source #
O(1) Get the length of the array i.e. the number of elements in the array.
byteLength :: Array a > Int Source #
O(1) Get the byte length of the array.
Folding
byteCmp :: Array a > Array a > Ordering Source #
Byte compare two arrays. Compare the length of the arrays. If the length is equal, compare the lexicographical ordering of two underlying byte arrays otherwise return the result of length comparison.
Unsafe: Note that the Unbox
instance of sum types with constructors of
different sizes may leave some memory uninitialized which can make byte
comparison unreliable.
Prerelease
byteEq :: Array a > Array a > Bool Source #
Byte equality of two arrays.
>>>
byteEq arr1 arr2 = (==) EQ $ Array.byteCmp arr1 arr2
Unsafe: See byteCmp
.
Appending
splice :: MonadIO m => Array a > Array a > m (Array a) Source #
Copy two immutable arrays into a new array. If you want to splice more
than two arrays then this operation would be highly inefficient because it
would make a copy on every splice operation, instead use the
fromChunksK
operation to combine n immutable arrays.
Streams of arrays
Chunk
Group a stream into arrays.
chunksOf :: forall m a. (MonadIO m, Unbox a) => Int > Stream m a > Stream m (Array a) Source #
chunksOf n stream
groups the elements in the input stream into arrays of
n
elements each.
Same as the following but may be more efficient:
>>>
chunksOf n = Stream.foldMany (Array.writeN n)
Prerelease
pinnedChunksOf :: forall m a. (MonadIO m, Unbox a) => Int > Stream m a > Stream m (Array a) Source #
Like chunksOf
but creates pinned arrays.
Split
Split an array into slices.
Concat
Append the arrays in a stream to form a stream of elements.
concat :: (Monad m, Unbox a) => Stream m (Array a) > Stream m a Source #
Convert a stream of arrays into a stream of their elements.
>>>
concat = Stream.unfoldMany Array.reader
concatRev :: forall m a. (Monad m, Unbox a) => Stream m (Array a) > Stream m a Source #
Convert a stream of arrays into a stream of their elements reversing the contents of each array before flattening.
>>>
concatRev = Stream.unfoldMany Array.readerRev
Compact
Append the arrays in a stream to form a stream of larger arrays.
fCompactGE :: (MonadIO m, Unbox a) => Int > Fold m (Array a) (Array a) Source #
Fold fCompactGE n
coalesces adjacent arrays in the input stream
until the size becomes greater than or equal to n.
Generates unpinned arrays irrespective of the pinning status of input arrays.
fPinnedCompactGE :: (MonadIO m, Unbox a) => Int > Fold m (Array a) (Array a) Source #
PInned version of fCompactGE
.
lCompactGE :: (MonadIO m, Unbox a) => Int > Fold m (Array a) () > Fold m (Array a) () Source #
Like compactGE
but for transforming folds instead of stream.
>>>
lCompactGE n = Fold.many (Array.fCompactGE n)
Generates unpinned arrays irrespective of the pinning status of input arrays.
lPinnedCompactGE :: (MonadIO m, Unbox a) => Int > Fold m (Array a) () > Fold m (Array a) () Source #
Pinned version of lCompactGE
.
compactGE :: (MonadIO m, Unbox a) => Int > Stream m (Array a) > Stream m (Array a) Source #
compactGE n stream
coalesces adjacent arrays in the stream
until
the size becomes greater than or equal to n
.
>>>
compactGE n = Stream.foldMany (Array.fCompactGE n)
Generates unpinned arrays irrespective of the pinning status of input arrays.
Deprecated
asPtrUnsafe :: MonadIO m => Array a > (Ptr a > m b) > m b Source #
Deprecated: Please use unsafePinnedAsPtr instead.
unsafeIndex :: forall a. Unbox a => Int > Array a > a Source #
Deprecated: Please use getIndexUnsafe
instead
bufferChunks :: (MonadIO m, Unbox a) => Stream m a > m (StreamK m (Array a)) Source #
Deprecated: Please use buildChunks instead.
flattenArrays :: forall m a. (MonadIO m, Unbox a) => Stream m (Array a) > Stream m a Source #
Deprecated: Please use "unfoldMany reader" instead.
flattenArraysRev :: forall m a. (MonadIO m, Unbox a) => Stream m (Array a) > Stream m a Source #
Deprecated: Please use "unfoldMany readerRev" instead.
fromArrayStreamK :: (Unbox a, MonadIO m) => StreamK m (Array a) > m (Array a) Source #
Deprecated: Please use fromChunksK instead.
fromStreamDN :: forall m a. (MonadIO m, Unbox a) => Int > Stream m a > m (Array a) Source #
Deprecated: Please use fromStreamN instead.
fromStreamD :: forall m a. (MonadIO m, Unbox a) => Stream m a > m (Array a) Source #
Deprecated: Please use fromStream instead.
toStreamD :: forall m a. (Monad m, Unbox a) => Array a > Stream m a Source #
Deprecated: Please use read
instead.
toStreamDRev :: forall m a. (Monad m, Unbox a) => Array a > Stream m a Source #
Deprecated: Please use readRev
instead.
writeWith :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Deprecated: Please use createWith instead.
pinnedWriteN :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Deprecated: Please use pinnedCreateOf instead.
writeNUnsafe :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Deprecated: Please use unsafeCreateOf instead.
pinnedWriteNUnsafe :: forall m a. (MonadIO m, Unbox a) => Int > Fold m a (Array a) Source #
Deprecated: Please use unsafePinnedCreateOf instead.
pinnedWriteNAligned :: forall m a. (MonadIO m, Unbox a) => Int > Int > Fold m a (Array a) Source #
Deprecated: To be removed.
pinnedWriteNAligned alignment n
folds a maximum of n
elements from the input
stream to an Array
aligned to the given size.
Prerelease
pinnedWrite :: forall m a. (MonadIO m, Unbox a) => Fold m a (Array a) Source #
Deprecated: Please use pinnedCreate instead.
Construction
writeLastN :: (Storable a, Unbox a, MonadIO m) => Int > Fold m a (Array a) Source #
writeLastN n
folds a maximum of n
elements from the end of the input
stream to an Array
.
Random Access
getIndex :: forall a. Unbox a => Int > Array a > Maybe a Source #
O(1) Lookup the element at the given index. Index starts from 0.
getIndexRev :: forall a. Unbox a => Int > Array a > Maybe a Source #
Like getIndex
but indexes the array in reverse from the end.
Prerelease
indexReader :: (Monad m, Unbox a) => Stream m Int > Unfold m (Array a) a Source #
Given a stream of array indices, read the elements on those indices from the supplied Array. An exception is thrown if an index is out of bounds.
This is the most general operation. We can implement other operations in terms of this:
read = let u = lmap (arr > (0, length arr  1)) Unfold.enumerateFromTo in Unfold.lmap f (indexReader arr) readRev = let i = length arr  1 in Unfold.lmap f (indexReaderFromThenTo i (i  1) 0)
Prerelease
indexReaderFromThenTo :: Unfold m (Int, Int, Int, Array a) a Source #
Unfolds (from, then, to, array)
generating a finite stream whose first
element is the array value from the index from
and the successive elements
are from the indices in increments of then
up to to
. Index enumeration
can occur downwards or upwards depending on whether then
comes before or
after from
.
getIndicesFromThenTo = let f (from, next, to, arr) = (Stream.enumerateFromThenTo from next to, arr) in Unfold.lmap f getIndices
Unimplemented
Size
Search
binarySearch :: a > Array a > Maybe Int Source #
Given a sorted array, perform a binary search to find the given element. Returns the index of the element if found.
Unimplemented
indexFinder :: (a > Bool) > Unfold Identity (Array a) Int Source #
Perform a linear search to find all the indices where a given element is present in an array.
Unimplemented
Casting
cast :: forall a b. Unbox b => Array a > Maybe (Array b) Source #
Cast an array having elements of type a
into an array having elements of
type b
. The length of the array should be a multiple of the size of the
target element otherwise Nothing
is returned.
castUnsafe :: Array a > Array b Source #
Cast an array having elements of type a
into an array having elements of
type b
. The array size must be a multiple of the size of type b
otherwise accessing the last element of the array may result into a crash or
a random value.
Prerelease
asCStringUnsafe :: Array a > (CString > IO b) > IO b Source #
Convert an array of any type into a null terminated CString Ptr. If the array is unpinned it is first converted to a pinned array which requires a copy.
Unsafe
O(n) Time: (creates a copy of the array)
Prerelease
Subarrays
O(1) Slice an array in constant time.
Caution: The bounds of the slice are not checked.
Unsafe
Prerelease
:: forall m a. (Monad m, Unbox a)  
=> Int  from index 
> Int  length of the slice 
> Unfold m (Array a) (Array a) 
Generate a stream of slices of specified length from an array, starting from the supplied array index. The last slice may be shorter than the requested length.
Prerelease/
splitOn :: (Monad m, Unbox a) => (a > Bool) > Array a > Stream m (Array a) Source #
Split the array into a stream of slices using a predicate. The element matching the predicate is dropped.
Prerelease
Streaming Operations
streamTransform :: forall m a b. (MonadIO m, Unbox a, Unbox b) => (Stream m a > Stream m b) > Array a > m (Array b) Source #
Transform an array into another array using a stream transformation operation.
Prerelease
Folding
streamFold :: (Monad m, Unbox a) => (Stream m a > m b) > Array a > m b Source #
Fold an array using a stream fold operation.
Prerelease
fold :: forall m a b. (Monad m, Unbox a) => Fold m a b > Array a > m b Source #
Fold an array using a Fold
.
Prerelease
Stream of Arrays
interpose :: (Monad m, Unbox a) => a > Stream m (Array a) > Stream m a Source #
Insert the given element between arrays and flatten.
>>>
interpose x = Stream.interpose x Array.reader
interposeSuffix :: forall m a. (Monad m, Unbox a) => a > Stream m (Array a) > Stream m a Source #
Insert the given element after each array and flatten. This is similar to unlines.
>>>
interposeSuffix x = Stream.interposeSuffix x Array.reader
intercalateSuffix :: (Monad m, Unbox a) => Array a > Stream m (Array a) > Stream m a Source #
Insert the given array after each array and flatten.
>>>
intercalateSuffix = Stream.intercalateSuffix Array.reader
compactLE :: (MonadIO m, Unbox a) => Int > Stream m (Array a) > Stream m (Array a) Source #
compactLE n
coalesces adjacent arrays in the input stream
only if the combined size would be less than or equal to n.
Generates unpinned arrays irrespective of the pinning status of input arrays.
pinnedCompactLE :: (MonadIO m, Unbox a) => Int > Stream m (Array a) > Stream m (Array a) Source #
Pinned version of compactLE
.
compactOnByte :: MonadIO m => Word8 > Stream m (Array Word8) > Stream m (Array Word8) Source #
Split a stream of arrays on a given separator byte, dropping the separator and coalescing all the arrays between two separators into a single array.
compactOnByteSuffix :: MonadIO m => Word8 > Stream m (Array Word8) > Stream m (Array Word8) Source #
Like compactOnByte
considers the separator in suffix position instead of
infix position.
foldBreakChunks :: forall m a b. (MonadIO m, Unbox a) => Fold m a b > Stream m (Array a) > m (b, Stream m (Array a)) Source #
foldChunks :: (MonadIO m, Unbox a) => Fold m a b > Stream m (Array a) > m b Source #
Fold a stream of arrays using a Fold
. This is equivalent to the
following:
>>>
foldChunks f = Stream.fold f . Stream.unfoldMany Array.reader
foldBreakChunksK :: forall m a b. (MonadIO m, Unbox a) => Fold m a b > StreamK m (Array a) > m (b, StreamK m (Array a)) Source #
Fold a stream of arrays using a Fold
and return the remaining stream.
The following alternative to this function allows composing the fold using the parser Monad:
foldBreakStreamK f s = fmap (first (fromRight undefined)) $ StreamK.parseBreakChunks (ParserK.adaptC (Parser.fromFold f)) s
We can compare perf and remove this one or define it in terms of that.
parseBreakChunksK :: forall m a b. (MonadIO m, Unbox a) => Parser a m b > StreamK m (Array a) > m (Either ParseError b, StreamK m (Array a)) Source #
Parse an array stream using the supplied Parser
. Returns the parse
result and the unconsumed stream. Throws ParseError
if the parse fails.
The following alternative to this function allows composing the parser using the parser Monad:
>>>
parseBreakStreamK p = StreamK.parseBreakChunks (ParserK.adaptC p)
We can compare perf and remove this one or define it in terms of that.
Internal
Serialization
serialize :: Serialize a => a > Array Word8 Source #
Properties:
1. Identity: deserialize . serialize == id
2. Encoded equivalence: serialize a == serialize a
pinnedSerialize :: Serialize a => a > Array Word8 Source #
Serialize a Haskell type to a pinned byte array. The array is allocated using pinned memory so that it can be used directly in OS APIs for writing to file or sending over the network.
Properties:
1. Identity: deserialize . pinnedSerialize == id
2. Encoded equivalence: pinnedSerialize a == pinnedSerialize a
deserialize :: Serialize a => Array Word8 > a Source #
Decode a Haskell type from a byte array containing its serialized representation.
Deprecated
:: forall m a. (Monad m, Unbox a)  
=> Int  from index 
> Int  length of the slice 
> Unfold m (Array a) (Int, Int) 
Deprecated: Please use sliceIndexerFromLen instead.