com.microsoft.ml.spark.core.env.StreamUtilities
Iterates through the entries of a streamed .zip file, returning only a sampleRatio fraction of them.
Input stream of the .zip file.
Name of the .zip file; it is used only to construct the names of the entries.
Fraction of the files in the .zip that is returned.
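The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: the object `ZipSampling`, the helper `sampledZipEntries`, its `random` parameter, and the `zipfile/entryName` naming scheme are all assumptions made for illustration. It uses only `java.util.zip.ZipInputStream` from the JDK and keeps each file entry with probability `sampleRatio`.

```scala
import java.io.{ByteArrayOutputStream, InputStream}
import java.util.zip.ZipInputStream
import scala.util.Random

// Hypothetical illustration, not the StreamUtilities implementation.
object ZipSampling {

  /** Lazily yields (entryName, bytes) pairs from a streamed .zip file.
    *
    * @param stream      input stream of the .zip file (the caller closes it)
    * @param zipfile     used only to build entry names (assumed format: zipfile/entry)
    * @param sampleRatio probability with which each file entry is kept
    * @param random      assumed extra parameter so that sampling can be seeded
    */
  def sampledZipEntries(stream: InputStream,
                        zipfile: String,
                        sampleRatio: Double = 1.0,
                        random: Random = new Random()): Iterator[(String, Array[Byte])] = {
    val zip = new ZipInputStream(stream)
    Iterator
      .continually(zip.getNextEntry)   // returns null once the archive is exhausted
      .takeWhile(_ != null)
      .filter(entry => !entry.isDirectory && random.nextDouble() < sampleRatio)
      .map { entry =>
        // Read the current entry fully; read returns -1 at the end of the entry.
        val out = new ByteArrayOutputStream()
        val buffer = new Array[Byte](8192)
        var n = zip.read(buffer)
        while (n != -1) {
          out.write(buffer, 0, n)
          n = zip.read(buffer)
        }
        (s"$zipfile/${entry.getName}", out.toByteArray)
      }
  }
}
```

Because the result is an `Iterator`, entries are read only as they are consumed, so skipped entries are never loaded into memory. With `sampleRatio = 1.0` every file entry is returned, and with `sampleRatio = 0.0` none are.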
(Changed in version 2.8.0) collect has changed. The previous behavior can be reproduced with toSeq.