Data Analysis and Mining

Spark SQL Analysis

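# Register the existing DataFrame `df` as a temporary view so SQL can query it.
df.createOrReplaceTempView('users')
# Count the rows in the view (`spark` is the active SparkSession).
spark.sql('SELECT COUNT(*) FROM users').show()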

Machine Learning and Mining

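from pyspark.ml.classification import LogisticRegression
# With default settings, fit() expects `df` to contain a vector column
# named 'features' and a numeric column named 'label'.
lr = LogisticRegression()
model = lr.fit(df)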

Streaming Analysis

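from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
# The Kafka source also requires 'kafka.bootstrap.servers'; 'localhost:9092' is a placeholder.
ds = (spark.readStream.format('kafka')
      .option('kafka.bootstrap.servers', 'localhost:9092')
      .option('subscribe', 'topic').load())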
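The Kafka source needs a broker address in addition to the subscribed topic, and it delivers record keys and values as binary. The sketch below collects the options in one place and decodes both fields as strings. The helper names `kafka_source_options` and `read_kafka_stream` are hypothetical, the broker address is supplied by the caller, and the code assumes the Spark Kafka connector package is available to the session.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's Kafka streaming source."""
    if not bootstrap_servers or not topic:
        raise ValueError("bootstrap_servers and topic must be non-empty")
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame with key and value decoded as strings."""
    options = kafka_source_options(bootstrap_servers, topic)
    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers key and value as binary, so cast them before use.
    return raw.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )
```

Calling `read_kafka_stream(spark, "localhost:9092", "topic")` only defines the stream. No records are read until a sink is attached with `writeStream` and the query is started.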