synapse.ml.io.http package

Submodules

synapse.ml.io.http.CustomInputParser module

class synapse.ml.io.http.CustomInputParser.CustomInputParser(java_obj=None, inputCol=None, outputCol=None, udfPython=None, udfScala=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • inputCol (str) – The name of the input column

  • outputCol (str) – The name of the output column

  • udfPython (object) – User-defined Python function to apply to the DataFrame input column

  • udfScala (object) – User-defined function to apply to the DataFrame input column

getInputCol()[source]
Returns:

The name of the input column

Return type:

inputCol

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getUdfPython()[source]
Returns:

User-defined Python function to apply to the DataFrame input column

Return type:

udfPython

getUdfScala()[source]
Returns:

User-defined function to apply to the DataFrame input column

Return type:

udfScala

inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setInputCol(value)[source]
Parameters:

inputCol – The name of the input column

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(inputCol=None, outputCol=None, udfPython=None, udfScala=None)[source]

Set the (keyword only) parameters

setUdfPython(value)[source]
Parameters:

udfPython – User-defined Python function to apply to the DataFrame input column

setUdfScala(value)[source]
Parameters:

udfScala – User-defined function to apply to the DataFrame input column

udfPython = Param(parent='undefined', name='udfPython', doc='User Defined Python Function to be applied to the DF input col')
udfScala = Param(parent='undefined', name='udfScala', doc='User Defined Function to be applied to the DF input col')

synapse.ml.io.http.CustomOutputParser module

class synapse.ml.io.http.CustomOutputParser.CustomOutputParser(java_obj=None, inputCol=None, outputCol=None, udfPython=None, udfScala=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • inputCol (str) – The name of the input column

  • outputCol (str) – The name of the output column

  • udfPython (object) – User-defined Python function to apply to the DataFrame input column

  • udfScala (object) – User-defined function to apply to the DataFrame input column

getInputCol()[source]
Returns:

The name of the input column

Return type:

inputCol

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getUdfPython()[source]
Returns:

User-defined Python function to apply to the DataFrame input column

Return type:

udfPython

getUdfScala()[source]
Returns:

User-defined function to apply to the DataFrame input column

Return type:

udfScala

inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setInputCol(value)[source]
Parameters:

inputCol – The name of the input column

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(inputCol=None, outputCol=None, udfPython=None, udfScala=None)[source]

Set the (keyword only) parameters

setUdfPython(value)[source]
Parameters:

udfPython – User-defined Python function to apply to the DataFrame input column

setUdfScala(value)[source]
Parameters:

udfScala – User-defined function to apply to the DataFrame input column

udfPython = Param(parent='undefined', name='udfPython', doc='User Defined Python Function to be applied to the DF input col')
udfScala = Param(parent='undefined', name='udfScala', doc='User Defined Function to be applied to the DF input col')

synapse.ml.io.http.HTTPFunctions module

synapse.ml.io.http.HTTPFunctions.http_udf(func)[source]
synapse.ml.io.http.HTTPFunctions.requests_to_spark(p)[source]
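
Neither helper is described above, so the following is only a sketch of one plausible use. It assumes that http_udf wraps a plain Python function returning a requests.Request and yields a Spark column function whose output can feed HTTPTransformer. The endpoint https://example.com/api/country/ and the names country_url, country_request, and add_request_column are hypothetical. Third-party imports are deferred into the functions so the block loads without a Spark session.

```python
def country_url(country_code):
    """Build the (hypothetical) endpoint URL for one country code."""
    return "https://example.com/api/country/{}".format(country_code)


def country_request(country_code):
    """Return an unsent requests.Request for one input value."""
    # Local import: needs the third-party `requests` package.
    from requests import Request

    return Request("GET", country_url(country_code))


def add_request_column(df, input_col="country", request_col="request"):
    """Add a column of HTTP requests derived from `input_col`.

    Assumption: http_udf accepts a function that returns requests.Request.
    Requires pyspark and SynapseML at call time.
    """
    from pyspark.sql.functions import col
    from synapse.ml.io.http.HTTPFunctions import http_udf

    return df.withColumn(request_col, http_udf(country_request)(col(input_col)))
```

Under these assumptions, add_request_column(df) would return a DataFrame whose request column can be named as the inputCol of an HTTPTransformer.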

synapse.ml.io.http.HTTPTransformer module

class synapse.ml.io.http.HTTPTransformer.HTTPTransformer(java_obj=None, concurrency=1, concurrentTimeout=None, handler=None, inputCol=None, outputCol=None, timeout=60.0)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • handler (object) – Which strategy to use when handling requests

  • inputCol (str) – The name of the input column

  • outputCol (str) – The name of the output column

  • timeout (float) – number of seconds to wait before closing the connection
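
A minimal sketch of how the constructor keywords listed above might be passed together. The setting values and the names HTTP_SETTINGS and make_http_transformer are illustrative placeholders, not recommendations; the import is deferred because constructing the transformer needs an active Spark session with SynapseML available.

```python
# Illustrative settings only; keys match the constructor keywords above.
HTTP_SETTINGS = {
    "concurrency": 3,           # max number of concurrent calls
    "concurrentTimeout": 30.0,  # seconds to wait on futures
    "timeout": 60.0,            # seconds before closing the connection
    "inputCol": "request",      # column holding the requests to send
    "outputCol": "response",    # column that receives the responses
}


def make_http_transformer(settings=None):
    """Construct an HTTPTransformer from keyword arguments.

    Requires an active Spark session with SynapseML on the classpath.
    """
    from synapse.ml.io.http.HTTPTransformer import HTTPTransformer

    return HTTPTransformer(**(settings or HTTP_SETTINGS))
```

Because HTTPTransformer derives from JavaTransformer, the resulting object would be applied with transform(df) like any other Spark ML transformer.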

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

getInputCol()[source]
Returns:

The name of the input column

Return type:

inputCol

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setInputCol(value)[source]
Parameters:

inputCol – The name of the input column

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(concurrency=1, concurrentTimeout=None, handler=None, inputCol=None, outputCol=None, timeout=60.0)[source]

Set the (keyword only) parameters

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

synapse.ml.io.http.JSONInputParser module

class synapse.ml.io.http.JSONInputParser.JSONInputParser(java_obj=None, headers={}, inputCol=None, method='POST', outputCol=None, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • headers (dict) – headers of the request

  • inputCol (str) – The name of the input column

  • method (str) – HTTP method to use for the request (PUT, POST, or PATCH)

  • outputCol (str) – The name of the output column

  • url (str) – URL of the service
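
The parameters above could be combined roughly as follows. This is a sketch under stated assumptions: the header name X-Api-Key and the helpers auth_headers and make_json_input_parser are hypothetical, and the import is deferred because building the parser needs an active Spark session with SynapseML available.

```python
def auth_headers(api_key=None):
    """Build the headers dict; the header name is a hypothetical example."""
    headers = {}
    if api_key is not None:
        headers["X-Api-Key"] = api_key
    return headers


def make_json_input_parser(url, input_col, output_col, api_key=None):
    """Construct a JSONInputParser using only the documented keywords.

    Requires an active Spark session with SynapseML on the classpath.
    """
    from synapse.ml.io.http.JSONInputParser import JSONInputParser

    return JSONInputParser(
        url=url,
        method="POST",  # the documented default
        headers=auth_headers(api_key),
        inputCol=input_col,
        outputCol=output_col,
    )
```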

getHeaders()[source]
Returns:

headers of the request

Return type:

headers

getInputCol()[source]
Returns:

The name of the input column

Return type:

inputCol

static getJavaPackage()[source]

Returns package name String.

getMethod()[source]
Returns:

HTTP method to use for the request (PUT, POST, or PATCH)

Return type:

method

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getUrl()[source]
Returns:

URL of the service

Return type:

url

headers = Param(parent='undefined', name='headers', doc='headers of the request')
inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
method = Param(parent='undefined', name='method', doc='method to use for request, (PUT, POST, PATCH)')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setHeaders(value)[source]
Parameters:

headers – headers of the request

setInputCol(value)[source]
Parameters:

inputCol – The name of the input column

setMethod(value)[source]
Parameters:

method – HTTP method to use for the request (PUT, POST, or PATCH)

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(headers={}, inputCol=None, method='POST', outputCol=None, url=None)[source]

Set the (keyword only) parameters

setUrl(value)[source]
Parameters:

url – URL of the service

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.io.http.JSONOutputParser module

class synapse.ml.io.http.JSONOutputParser.JSONOutputParser(java_obj=None, dataType=None, inputCol=None, outputCol=None, postProcessor=None)[source]

Bases: _JSONOutputParser

getDataType()[source]
Returns:

format to parse the column to

Return type:

dataType

setDataType(value)[source]
Parameters:

dataType – format to parse the column to

synapse.ml.io.http.ServingFunctions module

synapse.ml.io.http.ServingFunctions.request_to_string(c)[source]
synapse.ml.io.http.ServingFunctions.string_to_response(c)[source]

synapse.ml.io.http.SimpleHTTPTransformer module

class synapse.ml.io.http.SimpleHTTPTransformer.SimpleHTTPTransformer(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='SimpleHTTPTransformer_ebd2e9bf6311_errors', flattenOutputBatches=None, handler=None, inputCol=None, inputParser=None, miniBatcher=None, outputCol=None, outputParser=None, timeout=60.0)[source]

Bases: _SimpleHTTPTransformer

setUrl(value)[source]
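
As a hedged illustration, the constructor keywords and setUrl shown above might be combined with a JSONOutputParser like this. The reply schema (a single string field named replies), the default column names, and the helper make_simple_client are assumptions made for the example. Imports are deferred because the objects can only be built with an active Spark session and SynapseML available.

```python
def make_simple_client(url, input_col="data", output_col="results"):
    """Build a SimpleHTTPTransformer that parses JSON replies.

    Requires pyspark and an active Spark session with SynapseML available.
    """
    from pyspark.sql.types import StringType, StructType
    from synapse.ml.io.http.JSONOutputParser import JSONOutputParser
    from synapse.ml.io.http.SimpleHTTPTransformer import SimpleHTTPTransformer

    # Hypothetical reply shape: {"replies": "..."}.
    reply_schema = StructType().add("replies", StringType())

    # setDataType is the documented setter for the format to parse into.
    output_parser = JSONOutputParser()
    output_parser.setDataType(reply_schema)

    client = SimpleHTTPTransformer(
        inputCol=input_col,
        outputCol=output_col,
        outputParser=output_parser,
    )
    client.setUrl(url)
    return client
```

The setters are called as separate statements rather than chained, since the reference above does not state what they return.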

synapse.ml.io.http.StringOutputParser module

class synapse.ml.io.http.StringOutputParser.StringOutputParser(java_obj=None, inputCol=None, outputCol=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • inputCol (str) – The name of the input column

  • outputCol (str) – The name of the output column

getInputCol()[source]
Returns:

The name of the input column

Return type:

inputCol

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setInputCol(value)[source]
Parameters:

inputCol – The name of the input column

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(inputCol=None, outputCol=None)[source]

Set the (keyword only) parameters

Module contents

SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.