Failed to find data source: kafka. Please deploy the application as per the deployment section of Structured Streaming + Kafka Integration Guide Spark

Hello, I am trying to use PySpark with Kafka. I have set up the Kafka cluster with the following versions:

  • Spark version is 3.5.0 | spark-3.5.0-bin-hadoop3
  • Kafka version is - kafka_2.13-3.6.0
  • pyspark version is 3.5.0

My code is

    import os
    os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
    os.environ["SPARK_HOME"] = "/content/spark-3.5.0-bin-hadoop3"
    os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars /content/spark-streaming-kafka-0-10-assembly_2.13-3.5.0.jar--packages,org.apache.spark:spark-streaming-kafka-0-10_2.13:3.5.0,org.apache.spark:spark-sql-kafka-0-10_2.13:3.5.0 pyspark-shell'

    import findspark
    findspark.init()

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import explode
    from pyspark.sql.functions import split

    spark = SparkSession \
        .builder \
        .appName("StructuredNetworkWordCount") \
        .getOrCreate()

    df = spark \
      .readStream \
      .format("kafka") \
      .option("kafka.bootstrap.servers", "localhost:9092") \
      .option("subscribe", "c1") \
      .option("includeHeaders", "true") \
      .load()

This returns the following error:

Failed to find data source: kafka. Please deploy the application as per the deployment section of Structured Streaming + Kafka Integration Guide.

I'm using Google Colab.

I have tried downgrading versions and have tried several older solutions from Stack Overflow, without success.
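For comparison, here is a minimal sketch of how the submit arguments could be assembled so that the flag and its value stay separated. In the string above, `--packages` is fused onto the end of the jar path and is followed by a comma instead of a space. The `kafka` format used by `readStream` is provided by the `spark-sql-kafka-0-10` artifact, and its Scala suffix has to match the Spark build. The `_2.12` suffix below is an assumption: it is based on the download name `spark-3.5.0-bin-hadoop3` having no `-scala2.13` suffix, which normally indicates a Scala 2.12 build. The helper `build_submit_args` is a hypothetical name used only for this illustration.

```python
import os


def build_submit_args(packages, jars=()):
    """Build a PYSPARK_SUBMIT_ARGS string (hypothetical helper).

    Each flag is separated from its value by a space, and multiple
    values for one flag are joined with commas.
    """
    parts = []
    if jars:
        parts += ["--jars", ",".join(jars)]
    if packages:
        parts += ["--packages", ",".join(packages)]
    # PySpark expects the string to end with "pyspark-shell".
    parts.append("pyspark-shell")
    return " ".join(parts)


# Assumption: the Spark build uses Scala 2.12, hence the _2.12 suffix.
KAFKA_PACKAGE = "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0"

# This must be set before the first SparkSession is created,
# otherwise the running JVM will not pick up the package.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args([KAFKA_PACKAGE])
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

In a notebook that already has a running Spark session, the runtime would most likely need to be restarted for the new value to take effect.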

