com.microsoft.azure.synapse.ml.io.binary

BinaryFileReader

object BinaryFileReader

Linear Supertypes: AnyRef, Any

Value Members

  1. def read(path: String, recursive: Boolean, spark: SparkSession, sampleRatio: Double = 1, inspectZip: Boolean = true, seed: Long = 0L): DataFrame

    Reads a directory of binary files from a local or remote source.

    path: path to the directory

    recursive: whether to search the directory recursively

    returns: a DataFrame with a single column, "binaryFiles"; see "columnSchema" for details
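
The `read` signature above can be exercised with a minimal sketch like the following. It assumes SynapseML and Spark are on the classpath; the object name `ReadBinaryFilesExample`, the helper `readDirectory`, and the input directory are hypothetical and chosen only for illustration. The optional arguments are passed at the default values shown in the signature.

```scala
import com.microsoft.azure.synapse.ml.io.binary.BinaryFileReader
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadBinaryFilesExample {

  // Hypothetical helper: reads every file under `dir`, including subdirectories.
  def readDirectory(spark: SparkSession, dir: String): DataFrame =
    BinaryFileReader.read(
      path = dir,
      recursive = true,   // also search subdirectories
      spark = spark,
      sampleRatio = 1.0,  // default from the signature
      inspectZip = true,  // default from the signature
      seed = 0L           // default from the signature
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("binary-file-reader-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input directory; replace with a real local or remote path.
    val files = readDirectory(spark, "/tmp/binary-files")
    files.printSchema()

    spark.stop()
  }
}
```

Named arguments are used here so that the call stays readable even though `spark` sits between the required and the optional parameters.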

  2. def readFromPaths(df: DataFrame, pathCol: String, bytesCol: String, concurrency: Int, timeout: Int): DataFrame

    df: the DataFrame containing the paths

    pathCol: the name of the column holding the paths to read

    bytesCol: the name of the resulting bytes column

    concurrency: the number of concurrent reads

    timeout: the timeout, in milliseconds
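
A call to `readFromPaths` might look like the sketch below, which builds a one-column DataFrame of paths and asks for the file contents in a new column. The object name `ReadFromPathsExample`, the helper `readListed`, the column names `path` and `bytes`, and the concurrency and timeout values are all assumptions made for illustration.

```scala
import com.microsoft.azure.synapse.ml.io.binary.BinaryFileReader
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadFromPathsExample {

  // Hypothetical helper: reads the content of each listed path into a "bytes" column.
  def readListed(spark: SparkSession, paths: Seq[String]): DataFrame = {
    import spark.implicits._

    // One row per file path, in a column named "path".
    val pathsDf = paths.toDF("path")

    BinaryFileReader.readFromPaths(
      df = pathsDf,
      pathCol = "path",    // column holding the paths to read
      bytesCol = "bytes",  // column that receives the file contents
      concurrency = 4,     // number of concurrent reads (illustrative value)
      timeout = 5000       // milliseconds (illustrative value)
    )
  }
}
```

The result can then be inspected with, for example, `readListed(spark, Seq("/tmp/binary-files/a.bin")).printSchema()`, where the path is again hypothetical.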

  3. def recursePath(fileSystem: FileSystem, path: Path, pathFilter: (FileStatus) ⇒ Boolean): Array[Path]
  4. def stream(path: String, spark: SparkSession, sampleRatio: Double = 1, inspectZip: Boolean = true, seed: Long = 0L): DataFrame

    Reads a directory of binary files from a local or remote source.

    path: path to the directory

    returns: a DataFrame with a single column, "binaryFiles"; see "columnSchema" for details
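
The following sketch shows one way `stream` might be used. The page does not say whether the returned DataFrame is a streaming one, so the code checks `isStreaming` and falls back to a batch `show()` otherwise. The object name `StreamBinaryFilesExample`, the helper `openStream`, the input directory, and the ten-second wait are hypothetical.

```scala
import com.microsoft.azure.synapse.ml.io.binary.BinaryFileReader
import org.apache.spark.sql.{DataFrame, SparkSession}

object StreamBinaryFilesExample {

  // Hypothetical helper: opens the directory with the default optional arguments.
  def openStream(spark: SparkSession, dir: String): DataFrame =
    BinaryFileReader.stream(path = dir, spark = spark)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("binary-file-stream-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input directory; replace with a real local or remote path.
    val files = openStream(spark, "/tmp/binary-files")

    if (files.isStreaming) {
      // Streaming DataFrame: print incoming rows to the console for a short while.
      val query = files.writeStream.format("console").start()
      query.awaitTermination(10000L) // wait up to 10 seconds
      query.stop()
    } else {
      // Batch DataFrame: print the first rows directly.
      files.show()
    }

    spark.stop()
  }
}
```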