Quantcast
Channel: Active questions tagged python - Stack Overflow
Viewing all articles
Browse latest Browse all 19818

Failed to find data source: kafka. Please deploy the application as per the deployment section of Structured Streaming + Kafka Integration Guide Spark

$
0
0

Hello I am trying to use pyspark + kafka in order to do this I execute this command in order to set up the kafka-cluster

  • Spark version is 3.5.0 | spark-3.5.0-bin-hadoop3
  • Kafka version is - kafka_2.13-3.6.0
  • pyspark version is 3.5.0

My code is

import osos.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"os.environ["SPARK_HOME"] = "/content/spark-3.5.0-bin-hadoop3"os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars /content/spark-streaming-kafka-0-10-assembly_2.13-3.5.0.jar--packages,org.apache.spark:spark-streaming-kafka-0-10_2.13:3.5.0,org.apache.spark:spark-sql-kafka-0-10_2.13:3.5.0 pyspark-shell'import findsparkfindspark.init()from pyspark.sql import SparkSessionfrom pyspark.sql.functions import explodefrom pyspark.sql.functions import splitspark = SparkSession \    .builder \    .appName("StructuredNetworkWordCount") \    .getOrCreate()df = spark \  .readStream \  .format("kafka") \  .option("kafka.bootstrap.servers", "localhost:9092") \  .option("subscribe", "c1") \  .option("includeHeaders", "true") \  .load()

This return the following error:

Failed to find data source: kafka. Please deploy the application as per the deployment section of Structured Streaming + Kafka Integration Guide.

I'm using Google Colab

I've try downgrade versions and try multiple old solutions from stack overflow


Viewing all articles
Browse latest Browse all 19818

Latest Images

Trending Articles



Latest Images

<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>