mmlspark.io.image package

Submodules

mmlspark.io.image.ImageUtils module

mmlspark.io.image.ImageUtils.readFromPaths(df, pathCol, imageCol='image')

Reads images from a column of filenames

Parameters
  • df (DataFrame) – The DataFrame to be processed

  • pathCol (str) – The name of the column containing filenames

  • imageCol (str) – The name of the added column of images

Returns

The dataframe with loaded images

Return type

DataFrame
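
As an illustration of the call above, the following is a minimal sketch rather than an official example. It assumes an existing SparkSession with mmlspark installed. The helper names (path_rows, load_images) and the column name "filenames" are hypothetical and do not come from the library.

```python
def path_rows(paths):
    """Turn a list of filenames into one-column rows for createDataFrame."""
    return [(p,) for p in paths]


def load_images(spark, paths, path_col="filenames", image_col="image"):
    """Build a DataFrame of filenames and add a column of loaded images.

    `spark` is an existing SparkSession (assumed to have mmlspark available).
    The import is deferred so that defining this function does not itself
    require Spark or mmlspark.
    """
    from mmlspark.io.image import ImageUtils  # module documented above

    # One row per file; path_col is a hypothetical column name.
    df = spark.createDataFrame(path_rows(paths), [path_col])

    # Returns the input DataFrame with an extra image column.
    return ImageUtils.readFromPaths(df, path_col, image_col)
```

With a live session, a call such as load_images(spark, ["/data/cat.png"]) would be expected to return the original column plus an "image" column. The path shown is a placeholder.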

mmlspark.io.image.ImageUtils.readFromStrings(df, bytesCol, imageCol='image', dropPrefix=False)

Reads images from a column of strings containing encoded image bytes

Parameters
  • df (DataFrame) – The DataFrame to be processed

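  • bytesCol (str) – The name of the column containing the encoded image strings

  • imageCol (str) – The name of the added column of images

  • dropPrefix (bool) – Whether to drop a leading prefix from each string before decoding (default False)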

Returns

The dataframe with loaded images

Return type

DataFrame
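
The sketch below shows one way this function might be used. It rests on two assumptions that this page does not confirm: that the strings are base64-encoded image bytes, and that dropPrefix=True strips a data-URI style prefix such as "data:image/png;base64,". The helper names (to_data_uri, load_encoded_images) and the column name "encoded" are hypothetical.

```python
import base64


def to_data_uri(raw, mime="image/png"):
    """Encode raw image bytes as a base64 data URI string."""
    payload = base64.b64encode(raw).decode("ascii")
    return "data:{};base64,{}".format(mime, payload)


def load_encoded_images(spark, encoded, bytes_col="encoded", image_col="image"):
    """Build a DataFrame of encoded strings and add a column of images.

    `spark` is an existing SparkSession (assumed to have mmlspark available).
    The import is deferred so that defining this function does not itself
    require Spark or mmlspark.
    """
    from mmlspark.io.image import ImageUtils  # module documented above

    # One row per encoded image; bytes_col is a hypothetical column name.
    df = spark.createDataFrame([(s,) for s in encoded], [bytes_col])

    # ASSUMPTION: dropPrefix=True removes the "data:...;base64," prefix
    # before decoding. Verify against the library source before relying on it.
    return ImageUtils.readFromStrings(df, bytes_col, image_col, dropPrefix=True)
```

If the strings carry no prefix, leaving dropPrefix at its default of False is presumably the right choice.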

Module contents

MicrosoftML is a library of Python classes that interface with the Microsoft Scala APIs, using Apache Spark to create distributed machine learning models.

MicrosoftML simplifies training and scoring classifiers and regressors, and facilitates the creation of models using the CNTK library, images, and text.