synapse.ml.io.http package
Submodules
synapse.ml.io.http.CustomInputParser module
- class synapse.ml.io.http.CustomInputParser.CustomInputParser(java_obj=None, inputCol=None, outputCol=None, udfPython=None, udfScala=None)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
- getUdfPython()
- Returns
User Defined Python Function to be applied to the DF input col
- Return type
udfPython
- getUdfScala()
- Returns
User Defined Function to be applied to the DF input col
- Return type
udfScala
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setParams(inputCol=None, outputCol=None, udfPython=None, udfScala=None)
Set the (keyword only) parameters
- setUdfPython(value)
- Parameters
udfPython – User Defined Python Function to be applied to the DF input col
- setUdfScala(value)
- Parameters
udfScala – User Defined Function to be applied to the DF input col
- udfPython = Param(parent='undefined', name='udfPython', doc='User Defined Python Function to be applied to the DF input col')
- udfScala = Param(parent='undefined', name='udfScala', doc='User Defined Function to be applied to the DF input col')
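The udfPython parameter holds a user-defined Python function that is applied to the input column. The following is a minimal sketch of such a function, assuming it is called once per column value. The function name `build_request`, the example.com URL, and the dict it returns are hypothetical; this page does not state which return type the parser expects, so check the library source before relying on this shape.

```python
from urllib.parse import quote


def build_request(country_code):
    """Map one input-column value to a description of an HTTP request.

    Hypothetical helper: the URL and the dict layout are illustrative
    assumptions, not part of the SynapseML API.
    """
    return {
        "method": "GET",
        # Percent-encode the value so it is safe to embed in a URL path.
        "url": "https://example.com/countries/" + quote(str(country_code), safe=""),
        "headers": {"Accept": "application/json"},
    }


# The function can be exercised on plain Python values before it is
# handed to a parser, which keeps the logic easy to unit test.
requests_for_rows = [build_request(code) for code in ["br", "us/ca"]]
```

Keeping the function free of Spark imports means it can be tested without a cluster.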
synapse.ml.io.http.CustomOutputParser module
- class synapse.ml.io.http.CustomOutputParser.CustomOutputParser(java_obj=None, inputCol=None, outputCol=None, udfPython=None, udfScala=None)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
- getUdfPython()
- Returns
User Defined Python Function to be applied to the DF input col
- Return type
udfPython
- getUdfScala()
- Returns
User Defined Function to be applied to the DF input col
- Return type
udfScala
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setParams(inputCol=None, outputCol=None, udfPython=None, udfScala=None)
Set the (keyword only) parameters
- setUdfPython(value)
- Parameters
udfPython – User Defined Python Function to be applied to the DF input col
- setUdfScala(value)
- Parameters
udfScala – User Defined Function to be applied to the DF input col
- udfPython = Param(parent='undefined', name='udfPython', doc='User Defined Python Function to be applied to the DF input col')
- udfScala = Param(parent='undefined', name='udfScala', doc='User Defined Function to be applied to the DF input col')
synapse.ml.io.http.HTTPFunctions module
synapse.ml.io.http.HTTPTransformer module
- class synapse.ml.io.http.HTTPTransformer.HTTPTransformer(java_obj=None, concurrency=1, concurrentTimeout=None, handler=None, inputCol=None, outputCol=None, timeout=60.0)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- getConcurrentTimeout()
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, handler=None, inputCol=None, outputCol=None, timeout=60.0)
Set the (keyword only) parameters
- setTimeout(value)
- Parameters
timeout – number of seconds to wait before closing the connection
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
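The concurrency and concurrentTimeout parameters describe a cap on the number of calls in flight and a bound on how long to wait on their futures. The sketch below illustrates that idea with the Python standard library only. It is a conceptual analogue, not the library's implementation: `call_all` and its arguments are hypothetical names, and how HTTPTransformer treats unfinished or failed calls is not stated on this page.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def call_all(handler, requests, concurrency=1, concurrent_timeout=None):
    """Run handler over requests with at most `concurrency` calls in flight.

    Returns one entry per request, in request order. An entry is None if
    the call raised or did not finish within `concurrent_timeout` seconds.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(handler, r) for r in requests]
        # Block until every future finishes or the timeout elapses.
        done, not_done = wait(futures, timeout=concurrent_timeout)
        # Calls that have not started yet can be cancelled; calls that are
        # already running cannot, and the pool still waits for them on exit.
        for future in not_done:
            future.cancel()
        results = [
            f.result() if f in done and f.exception() is None else None
            for f in futures
        ]
    return results


responses = call_all(lambda r: r.upper(), ["a", "b", "c"],
                     concurrency=2, concurrent_timeout=5.0)
```

With `concurrent_timeout=None` the wait is unbounded, which mirrors the `concurrentTimeout=None` default in the signature above only by analogy.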
synapse.ml.io.http.JSONInputParser module
- class synapse.ml.io.http.JSONInputParser.JSONInputParser(java_obj=None, headers={}, inputCol=None, method='POST', outputCol=None, url=None)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
- headers = Param(parent='undefined', name='headers', doc='headers of the request')
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- method = Param(parent='undefined', name='method', doc='method to use for request, (PUT, POST, PATCH)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setParams(headers={}, inputCol=None, method='POST', outputCol=None, url=None)
Set the (keyword only) parameters
- url = Param(parent='undefined', name='url', doc='Url of the service')
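The url, method, and headers parameters together describe a JSON request to a service. As a rough illustration of what such a request looks like, the sketch below builds one with the standard library's `urllib.request.Request` and does not send it. The helper `to_json_request`, the example.com URL, and the `X-Api-Key` header are hypothetical, and the default `Content-Type` header is an assumption rather than documented behaviour of JSONInputParser.

```python
import json
from urllib.request import Request

# The method doc string above lists these three verbs.
ALLOWED_METHODS = {"PUT", "POST", "PATCH"}


def to_json_request(value, url, method="POST", headers=None):
    """Serialize `value` as JSON and wrap it in an unsent HTTP request."""
    if method not in ALLOWED_METHODS:
        raise ValueError("method must be one of PUT, POST, PATCH")
    body = json.dumps(value).encode("utf-8")
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})  # caller-supplied headers win
    return Request(url, data=body, headers=all_headers, method=method)


req = to_json_request(
    {"text": "hello"},
    "https://example.com/score",
    headers={"X-Api-Key": "placeholder"},
)
```

Note that `headers=None` is used as the default in this sketch instead of a mutable `{}`, so the default is never shared between calls.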
synapse.ml.io.http.JSONOutputParser module
synapse.ml.io.http.ServingFunctions module
synapse.ml.io.http.SimpleHTTPTransformer module
- class synapse.ml.io.http.SimpleHTTPTransformer.SimpleHTTPTransformer(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='SimpleHTTPTransformer_7da31ce14d3e_errors', flattenOutputBatches=None, handler=None, inputCol=None, inputParser=None, miniBatcher=None, outputCol=None, outputParser=None, timeout=60.0)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
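The constructor signature pairs an inputParser, a handler, and an outputParser with an errorCol, which suggests a parse, call, parse flow with failures recorded per row. The sketch below shows that flow in plain Python under that assumption. All names are hypothetical, the handler is a stand-in that makes no network call, and treating errors as caught exceptions is a simplification of this sketch, not documented behaviour of SimpleHTTPTransformer.

```python
def simple_http_transform(rows, input_parser, handler, output_parser):
    """Apply parse -> call -> parse to each value, recording errors per row."""
    out = []
    for value in rows:
        try:
            request = input_parser(value)
            response = handler(request)
            out.append({"input": value,
                        "output": output_parser(response),
                        "errors": None})
        except Exception as exc:
            # A failed row keeps its input and records the error message.
            out.append({"input": value, "output": None, "errors": str(exc)})
    return out


def fake_handler(request):
    """Stand-in for an HTTP call: echoes the body in upper case."""
    if request["body"] == "":
        raise ValueError("empty body")
    return {"status": 200, "body": request["body"].upper()}


rows = simple_http_transform(
    ["hi", ""],
    input_parser=lambda v: {"body": v},
    handler=fake_handler,
    output_parser=lambda resp: resp["body"],
)
```

Recording the error next to the row, instead of failing the whole batch, lets downstream code filter on the error field.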
synapse.ml.io.http.StringOutputParser module
- class synapse.ml.io.http.StringOutputParser.StringOutputParser(java_obj=None, inputCol=None, outputCol=None)
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.