rpatid10

Member Since 6 months ago
Bangalore

Experience Points: 0
Lessons Completed: 11
Lessons Completed: 3
Best Reply Awards: 6

23 contributions in the last year

Pinned
⚡ Apache Pinot (Incubating) - A realtime distributed OLAP datastore
⚡ Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
Activity

Oct 27 (1 month ago)

Created branch: rpatid10 created branch main in rpatid10/Marquez

Created repository (1 month ago)
Oct 7 (1 month ago)

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.

Hi All,

I am using StreamingLens in my Spark Structured Streaming application, but it always shows the same logs. The BatchId is getting updated, but Streaming Query State: NONEWBATCHES stays the same. Can someone suggest why the state and recommendations are not updating in the logs?

|||||||||||||||||| StreamingLens Insights |||||||||||||||||||||||||
BatchId: 344
Analysis Time: 00s 000ms
Expected Micro Batch SLA: 120s 000ms
Batch Running Time: 00s 000ms
Critical Time: 00s 000ms
Streaming Query State: NONEWBATCHES
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

21/10/01 15:50:04 WARN QueryInsightsManager: Streaming Lens failed key not found: BatchDescription(e68c3c2c-6d5f-469e-864a-)

Spark Submit Command:

spark-submit \
  --verbose \
  --name SparkStreamingLens \
  --num-executors 1 \
  --conf streamingLens.reporter.intervalMinutes=1 \
  --jars /home/abc/jars/spark-streaminglens_2.11-0.5.3.jar,/home/abc/jars/kafka-clients-0.10.2.1.jar \
  --master yarn \
  --deploy-mode cluster \
  --driver-cores 1 --driver-memory 2G --executor-cores 1 --executor-memory 2G \
  --supervise --class com.data.datalake.SparkStreamingLens \
  /home/abc/jar/SparkStreamingLens-spark-utility_2.11-1.0.jar

@abhishekd0907 @itsvikramagr @shubhamtagra @jsensarma @mjose007 @akumarb2010 @Indu-sharma @iamrohit @beriaanirudh @mayurdb @michaelmior @rishitesh @emlyn @vrajat @fdemesmaeker @indit-qubole Kindly suggest.

Kindly guide me if anything needs to change here.

https://github.com/qubole/streaminglens/blob/master/src/main/scala/com/qubole/spark/streaminglens/common/results/AggregateStateResults.scala

https://github.com/qubole/streaminglens/blob/master/src/main/scala/com/qubole/spark/streaminglens/common/results/StreamingCriticalPathResults.scala

In the project, com.qubole.spark.streaminglens.QueryInsightsManager contains the code below to fetch the insights:

| |||||||||||||||||| StreamingLens Inisights |||||||||||||||||||||||||
| BatchId: ${results.batchId}
| Analysis Time: ${pd(results.analysisTime)}
| Expected Micro Batch SLA: ${pd(results.streamingCriticalPathResults.expectedMicroBatchSLA)}
| Batch Running Time: ${pd(results.streamingCriticalPathResults.batchRunningTime)}
| Critical Time: ${pd(results.streamingCriticalPathResults.criticalTime)}
| Streaming Query State: ${results.streamingCriticalPathResults.streamingQueryState.toString}
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
""".stripMargin)

Here we take all the details from streamingCriticalPathResults, and the only state-specific code available is for the NONEWBATCHES state:

case class StreamingCriticalPathResults(
    expectedMicroBatchSLA: Long = 0,
    batchRunningTime: Long = 0,
    criticalTime: Long = 0,
    streamingQueryState: StreamingState.Value = StreamingState.NONEWBATCHES)

Also, in com.qubole.spark.streaminglens.common.results, AggregateStateResults.scala contains the code below:

package com.qubole.spark.streaminglens.common.results

case class AggregateStateResults(
    state: String = "NO NEW BATCHES",
    recommendation: String = "Streaming Query State: NO NEW BATCHES<br>")

Kindly suggest.
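For reference, the behaviour the two quoted case classes rely on is ordinary Scala default parameters: any code path that constructs the result without explicitly passing a state reports NONEWBATCHES. A minimal standalone sketch (this mirrors the quoted case class; the other enum values are assumed here, not taken from the library):

```scala
// Hypothetical enum mirroring StreamingLens's StreamingState; only
// NONEWBATCHES is confirmed by the quoted source.
object StreamingState extends Enumeration {
  val NONEWBATCHES, ANALYZING, ERROR = Value
}

case class StreamingCriticalPathResults(
    expectedMicroBatchSLA: Long = 0,
    batchRunningTime: Long = 0,
    criticalTime: Long = 0,
    streamingQueryState: StreamingState.Value = StreamingState.NONEWBATCHES)

// Constructing the result without a state, as when no finished batch is
// matched, yields the default state:
val results = StreamingCriticalPathResults(expectedMicroBatchSLA = 120000)
println(results.streamingQueryState)  // NONEWBATCHES
```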

Oct 4 (2 months ago)

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

Hi @abhishekd0907 ,

I was able to remove this warning. I was getting it because my application's batch interval was 4 minutes, which is less than StreamingLens's default analysis interval of 5 minutes. So the condition currentTime - lastAnalyzedTimeMills >= streamingLensConfig.analysisIntervalMinutes * 60 * 1000 evaluated to false as a boolean, and the line logWarning(s"Streaming Lens failed " + e.getMessage) was producing the warning.
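The arithmetic behind that condition can be checked with the numbers from the comment (a standalone sketch with hypothetical values, not the library's code):

```scala
// Hypothetical values mirroring the comment: the application's batch
// interval is 4 minutes; StreamingLens's default analysis interval is 5.
val analysisIntervalMinutes = 5               // assumed StreamingLens default
val lastAnalyzedTimeMills   = 0L
val currentTime             = 4 * 60 * 1000L  // a batch finishes 4 minutes later

// The same comparison the comment quotes:
val shouldAnalyze =
  currentTime - lastAnalyzedTimeMills >= analysisIntervalMinutes * 60 * 1000

println(shouldAnalyze)  // false: 240000 >= 300000 does not hold
```

So with a batch interval shorter than the analysis interval, the analysis is never triggered for that batch.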

I was debugging the issue and observed the points below.

  1. In QueryInsightsManager.scala:

if (insights.streamingCriticalPathResults.streamingQueryState.equals(StreamingState.ERROR)) {
  throw new SparkException("Unexpected Error or Timeout occurred during Analysis")
}
streamingLensResultsBuffer.enqueue(insights)
eventsReporter.foreach(_.sendEvent())

insights.streamingCriticalPathResults.streamingQueryState will always be NONEWBATCHES, because StreamingCriticalPathResults only carries state-specific code for NONEWBATCHES. It always reports no new batches, taking expectedMicroBatchSLA = 0, batchRunningTime = 0, criticalTime = 0, and streamingQueryState = StreamingState.NONEWBATCHES as the default values from StreamingCriticalPathResults.

case class StreamingCriticalPathResults(
    expectedMicroBatchSLA: Long = 0,
    batchRunningTime: Long = 0,
    criticalTime: Long = 0,
    streamingQueryState: StreamingState.Value = StreamingState.NONEWBATCHES)

private def startStreamingAnalysis(queryProgress: QueryProgress): Unit = {
  val currentTime = System.currentTimeMillis()
  if (shouldTriggerAnalysis(currentTime)) {
    val insights = streamingQueryAnalyzer.analyze(queryProgress)
    logResultsIfNecessary(insights)
    println("Insights from startStreamingAnalysis method of QueryInsightManager: " + insights)
    lastAnalyzedTimeMills = currentTime
    println("insights.streamingCriticalPathResults.streamingQueryState value: " +
      insights.streamingCriticalPathResults.streamingQueryState)
    if (insights.streamingCriticalPathResults.streamingQueryState.equals(StreamingState.ERROR)) {
      throw new SparkException("Unexpected Error or Timeout occurred during Analysis")
    }
    streamingLensResultsBuffer.enqueue(insights)
    eventsReporter.foreach(_.sendEvent())
  }
}

This block always takes the value from StreamingCriticalPathResults.streamingQueryState, which is NONEWBATCHES.

Kindly see the logs below and suggest.

QueryInsightsManager.analysis block results:

value of queryProgress: QueryProgress(22,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:07:43.483Z,3,0.22653477308766895)
lastProgress.batchId=22
lastProgress.id=12900529-7063-4446-a4b8-b58a94194e89
lastProgress.timestamp=2021-10-04T21:07:43.483Z
lastProgress.numInputRows=3
lastProgress.processedRowsPerSecond=0.22653477308766895
currentTime: 1633381676786
analysisIntervalMinutes time: 120000
value of default interval in milliseconds (this should be <= application batch interval time in milliseconds): 300000

StreamingQueryAnalyzer.scala results:
queryProgress: QueryProgress(22,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:07:43.483Z,3,0.22653477308766895)
lastAnalyzedBatchId: -1
batchStartAndEndTimes: (1633381663483,1633381676726)
batchStartAndEndTimes._1: 1633381663483
batchStartAndEndTimes._2: 1633381676726
batchRunningTime: 13243
batchDescription: BatchDescription(12900529-7063-4446-a4b8-b58a94194e89,22)

QueryInsightsManager.analysisTask() results:

value of queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:10:00.000Z,0,0.0)
lastProgress.batchId=23
lastProgress.id=12900529-7063-4446-a4b8-b58a94194e89
lastProgress.timestamp=2021-10-04T21:10:00.000Z
lastProgress.numInputRows=0
lastProgress.processedRowsPerSecond=0.0
currentTime: 1633381800115
analysisIntervalMinutes time: 120000
value of default interval in milliseconds (this should be <= application batch interval time in milliseconds): 300000

StreamingQueryAnalyzer.scala results:
queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:10:00.000Z,0,0.0)
lastAnalyzedBatchId: -1
batchStartAndEndTimes: (1633381800000,0)
insights: StreamingLensResults(23,0,StreamingCriticalPathResults(120000,0,0,NONEWBATCHES))
"Hello I am Inside QueryInsightsManager.scala Try Block"
Streaming Query Analyzer results: queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:10:00.000Z,0,0.0)
lastAnalyzedBatchId: -1
batchStartAndEndTimes: (1633381800000,0)

Insights from startStreamingAnalysis method of QueryInsightManager: StreamingLensResults(23,0,StreamingCriticalPathResults(120000,0,0,NONEWBATCHES))
insights.streamingCriticalPathResults.streamingQueryState value: NONEWBATCHES

value of queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:15:00.000Z,0,0.0)
lastProgress.batchId=23
lastProgress.id=12900529-7063-4446-a4b8-b58a94194e89
lastProgress.timestamp=2021-10-04T21:15:00.000Z
lastProgress.numInputRows=0
lastProgress.processedRowsPerSecond=0.0
currentTime: 1633382100114
lastAnalyzedTimeMills: 1633381800123
analysisIntervalMinutes time: 120000
value of default interval in milliseconds (this should be <= application batch interval time in milliseconds): 300000

StreamingQueryAnalyzer.scala results:
queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:15:00.000Z,0,0.0)
lastAnalyzedBatchId: -1
batchStartAndEndTimes: (1633382100000,0)
insights: StreamingLensResults(23,0,StreamingCriticalPathResults(120000,0,0,NONEWBATCHES))
"Hello I am Inside Try Block"
Streaming Query Analyzer results: queryProgress: QueryProgress(23,12900529-7063-4446-a4b8-b58a94194e89,2021-10-04T21:15:00.000Z,0,0.0)
lastAnalyzedBatchId: -1
batchStartAndEndTimes: (1633382100000,0)

Insights from startStreamingAnalysis method of QueryInsightManager: StreamingLensResults(23,0,StreamingCriticalPathResults(120000,0,0,NONEWBATCHES))
insights.streamingCriticalPathResults.streamingQueryState value: NONEWBATCHES

Forked: rpatid10 forked AvianshKumar/Ril-Marquez (updated 2 months ago)

Oct 3 (2 months ago)

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

@abhishekd0907 Okay, so there is no other way to use StreamingLens with Spark 2.2, right? Or is there any workaround we can do to use StreamingLens with Spark 2.2?

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

StreamingLens was built and tested with Spark 2.4, while the application is using Spark 2.2. Some internal APIs that StreamingLens uses changed between Spark 2.2 and Spark 2.4, so the present code does not work with Spark 2.2 and leads to the following error.

21/10/01 15:50:04 WARN QueryInsightsManager: Streaming Lens failed key not found: BatchDescription(e68c3c2c-6d5f-469e-864a-5d353d6e4bc2,2)

@abhishekd0907 But this warning does not show when I remove the checkpoint location; all other details are the same when the checkpoint location is removed. I am attaching a fresh log with a new checkpoint location. Kindly suggest. log2.txt

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

@abhishekd0907 But this warning does not show when I remove the checkpoint location; all other details are the same when the checkpoint location is removed. log2.txt

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

@abhishekd0907 I have attached the log; kindly check Logs.txt.

@abhishekd0907 Let me know if I need to share any other logs as well.

Thanks.

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10

@abhishekd0907 Let me know if I need to share any other logs as well.

Thanks.

Issue comment: rpatid10 commented on qubole/streaminglens

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


Kindly guide me if anything needs to change here:

https://github.com/qubole/streaminglens/blob/master/src/main/scala/com/qubole/spark/streaminglens/common/results/AggregateStateResults.scala

https://github.com/qubole/streaminglens/blob/master/src/main/scala/com/qubole/spark/streaminglens/common/results/StreamingCriticalPathResults.scala



Oct
2
2 months ago
Activity icon
issue

rpatid10 issue comment qubole/streaminglens

rpatid10
rpatid10

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.


rpatid10
rpatid10

Someone kindly help, or suggest if there is any other support channel available for this (e.g. a Slack channel).

Activity icon
fork

rpatid10 forked qubole/streaminglens

⚡ Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
rpatid10 Apache License 2.0 Updated
fork time in 2 months ago
Oct
1
2 months ago
Activity icon
issue

rpatid10 issue comment qubole/streaminglens

rpatid10
rpatid10

Not able to run streaminglens in intellij idea

I am running my code locally in IntelliJ IDEA with the streaminglens Maven dependency. I am getting the error below and no output; let me know what I am doing wrong here.


package com.manu.sstreaming;

import com.qubole.spark.streaminglens.StreamingLens;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.sql.streaming.Trigger;
import scala.Predef;
import scala.collection.JavaConversions.*;
import scala.collection.JavaConverters;
import scala.collection.Seq;


/**
 * @author Manu Jose
 * create on : 16/04/20
 */
public class SStreamingNC {
    public static void main(String[] args) throws Exception {


        String host = "localhost";
        int port = 9999;
        //int port = Integer.parseInt(args[0]);


        SparkSession spark = SparkSession
                .builder()
                .appName("JavaStructuredNetworkWordCount")
                .master("local")
                .getOrCreate();


        Map<String, String> options = new HashMap<>();
        options.put("streamingLens.reporter.intervalMinutes", "1");

        scala.collection.immutable.Map<String, String> scalaMap = JavaConverters.mapAsScalaMapConverter(options).asScala().toMap(
                Predef.conforms());
        StreamingLens streamingLens = new StreamingLens(spark, scalaMap);
        streamingLens.registerListeners();




        // Create DataFrame representing the stream of input lines from connection to host:port
        spark.sql("SET spark.sql.streaming.metricsEnabled=true");
        Dataset<Row> lines = spark
                .readStream()
                .format("socket")
                .option("host", host)
                .option("port", port)
                .load();

        // Split the lines into words
        Dataset<String> words = lines.as(Encoders.STRING()).flatMap(
                (FlatMapFunction<String, String>) x -> Arrays.asList(x.split(" ")).iterator(),
                Encoders.STRING());

        // Generate running word count
        Dataset<Row> wordCounts = words.groupBy("value").count();


        // Start running the query that prints the running counts to the console
        StreamingQuery query = wordCounts.writeStream()
                .outputMode("update")
                .format("console")
                .queryName("Query_name")
                .trigger(Trigger.ProcessingTime(2 * 1000))
                .start();

       spark.streams().awaitAnyTermination();

    }

}
20/05/01 01:06:10 INFO StateStore: Getting StateStoreCoordinatorRef
20/05/01 01:06:10 INFO StateStore: Retrieved reference to StateStoreCoordinator: org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef@11ddc5d8
20/05/01 01:06:10 INFO StateStore: Env is not null
20/05/01 01:06:10 INFO StateStore: Getting StateStoreCoordinatorRef
20/05/01 01:06:10 INFO StateStore: Retrieved reference to StateStoreCoordinator: org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef@c1c5bf
20/05/01 01:06:10 INFO StateStore: Env is not null
20/05/01 01:06:10 INFO StateStore: Getting StateStoreCoordinatorRef
20/05/01 01:06:10 INFO StateStore: Retrieved reference to StateStoreCoordinator: org.apache.spark.sql.execution.streaming.state.StateStoreCoordinatorRef@62726145

rpatid10
rpatid10

@mjose007 Kindly suggest: https://github.com/qubole/streaminglens/issues/5

For me the state always shows the same value: NONEWBATCHES.

Activity icon
issue

rpatid10 issue qubole/streaminglens

rpatid10
rpatid10

Streaming Lens failed key not found: BatchDescription

Can someone suggest what the issue is here? I am getting these logs while executing a Spark Structured Streaming application with StreamingLens. It is not generating any recommendation or state.

21/10/01 11:30:07 WARN QueryInsightsManager: Streaming Lens failed key not found: BatchDescription(62618e99-5175-4106-90f1-0b2ef0b49fda,2) 21/10/01 11:30:07 INFO QueryInsightsManager: Max retries reached. Attempting to stop StreamingLens 21/10/01 11:30:07 INFO QueryInsightsManager: Successfully shutdown StreamingLens

Spark submit command:

spark-submit --verbose --name SparkStreamingLens --num-executors 1 --conf streamingLens.reporter.intervalMinutes=1 --jars /home/abc/jars/spark-streaminglens_2.11-0.5.3.jar, /home/abc/jars/kafka-clients-0.10.2.1.jar, --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 2G --executor-cores 1 --executor-memory 2G --supervise --class com.data.datalake.SparkStreamingLens /home/abc/jar/SparkStreamingLens-spark-utility_2.11-1.0.jar

Kindly help.


Sep
30
2 months ago
Activity icon
issue

rpatid10 issue qubole/sparklens

rpatid10
rpatid10

Not able to see the sparklens.Json File at mentioned Location

Hi, I have recently started learning Sparklens and am trying to generate a sample JSON file for my Spark application, using the spark-submit command below.

spark-submit --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener --conf spark.sparklens.reporting.disabled=true --conf spark.sparklens.data.dir=/home/data/sparklens.json --num-executors 1 --jars /home/data/SparkLensJar/sparklens-0.1.2-s_2.11.jar --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 1G --executor-cores 1 --executor-memory 1G --supervise --class com.spark.data.Sparklens /home/data/SparkLensJar/data-spark-utility_2.11-1.0.jar

I am able to see the report/metrics in the logs, but it is not creating any JSON file at the mentioned location. I have also tried an HDFS location (/user/username) instead of a local file path and still cannot see the JSON file.
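One detail worth double-checking (an assumption based on the Sparklens README, not something confirmed in this thread): `spark.sparklens.data.dir` expects a directory (local or HDFS), not a .json file name — Sparklens chooses the file name itself. A hedged sketch of the submit command under that assumption, with placeholder paths:

```shell
# Sketch only: point spark.sparklens.data.dir at a DIRECTORY, not a file.
# Note: with --deploy-mode cluster, a local path is resolved on the driver's
# node, so an HDFS directory is usually easier to locate afterwards.
spark-submit \
  --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener \
  --conf spark.sparklens.reporting.disabled=true \
  --conf spark.sparklens.data.dir=/home/data/sparklens-out/ \
  --jars /home/data/SparkLensJar/sparklens-0.1.2-s_2.11.jar \
  --master yarn --deploy-mode cluster \
  --class com.spark.data.Sparklens \
  /home/data/SparkLensJar/data-spark-utility_2.11-1.0.jar
```

If this assumption holds, the file would appear inside that directory rather than at the exact path given.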

Activity icon
issue

rpatid10 issue qubole/sparklens

rpatid10
rpatid10

Implementation Of StreamingLens in Existing Spark Streaming Applications

I am trying to implement StreamingLens in a Spark application. I have added the two lines below to the existing code, as suggested here: https://github.com/qubole/streaminglens

Screenshot 2021-09-26 at 12 38 24 PM

    1. class StreamingLens_POC(spark: SparkSession, options: RequestBuilder){}
    2. val streamingLens = new StreamingLens_POC(spark, options) 

    // Added New Block For StreamingLense
    class StreamingLens_POC(spark: SparkSession, options: RequestBuilder)

   // Existing Code which was working fine without any issue.
    object StreamingLens_POC {
    def main(args: Array[String]): Unit = {
    val applicationName = args(0) 
    val spark = SparkSession
   .builder()
   .appName(applicationName)
   //.config("spark.master", "local") //Addition code to execute in local
   .getOrCreate()
  println("Spark Streaming Lens POC Program Started")
  val streamingLens = new StreamingLens_POC(spark, options)   // added this new line for StreamingLense
 //..... existing code Code....
..
..
..
..
}

After that, when I try to execute this application on the server using the spark-submit command below:

    spark-submit \
    --name SPARK_STREAMING_POC \
    --num-executors 1 \
    --jars  /home/username/jar/spark-streaminglens_2.11-0.5.3.jar , /home/username/jar/logstash-gelf-1.3.1.jar, ..(other required jar) \
    --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 2G --executor-cores 1 --executor-memory 2G \
    --supervise --class com.pkg.data.StreamingLens_POC /home/username/jar/PrjectJarName.jar \
    SPARK_STREAMING_POC

it gives the error below.

     21/09/24 11:50:26 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: biz.paluch.logging.gelf.log4j.GelfLogAppender.setAdditionalFieldTypes(Ljava/lang/String;)V
    java.lang.NoSuchMethodError: biz.paluch.logging.gelf.log4j.GelfLogAppender.setAdditionalFieldTypes(Ljava/lang/String;)V

Can someone kindly suggest whether I need to do any additional task here?
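For comparison, note that the snippet above declares its own empty class named StreamingLens_POC rather than instantiating the library's com.qubole.spark.streaminglens.StreamingLens. A hedged Scala sketch of the README-style initialization — the constructor signature comes from this thread and the project README, and the option values are examples only (left untested here, since it needs the streaminglens and Spark jars on the classpath):

```scala
import org.apache.spark.sql.SparkSession
import com.qubole.spark.streaminglens.StreamingLens

object StreamingLensInit {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("StreamingLens_POC").getOrCreate()

    // Options map per the streaminglens README; values are illustrative.
    val options = Map(
      "streamingLens.reporter.intervalMinutes" -> "1",
      "streamingLens.expectedMicroBatchSLAMillis" -> "120000")

    // Instantiate the library class, not a user-defined placeholder.
    val streamingLens = new StreamingLens(spark, options)

    // ... existing streaming query code ...
  }
}
```

The key design point: StreamingLens registers its listeners when the library class is constructed, so an empty user-defined class with the same name would register nothing.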

Activity icon
issue

rpatid10 issue qubole/sparklens

rpatid10
rpatid10

Not Able to See StreamingLens Report In Logs.

Hi,

I am trying to implement StreamingLens in my existing streaming application. The application is working fine and loads data from one Kafka topic to another. But in Ambari I am not able to see the StreamingLens reports. When I did this for a batch application using Sparklens, I could see the logs generated by Sparklens with all the resource information, but I cannot see the same report for the streaming application. Can someone suggest whether I need additional code, or where I should check for the reports that StreamingLens should generate?

My Sample code

      class SparkStreamingLens(spark: SparkSession, options: RequestBuilder)
      object SparkStreamingLens {
      def main(args: Array[String]): Unit = {
      println(" Spark Parameters are :")
      val igniteDetails = args(0)  
      val applicationName = args(1)
      val argumentTable = args(2)
      // options.addParameter("streamingLens.reporter.intervalMinutes", "1")
      val spark = SparkSession
      .builder()
      .appName(applicationName)
      .getOrCreate()
      val streamingLens = new SparkStreamingLens(spark, options)
      // Remaining Code to Read from Kafka and write Into Kafka(Streaming Data)
      }
      }

  spark-submit Command:

  spark-submit \
  --verbose \
  --name SparkStreamingLens \
  --num-executors 1  \
  --conf streamingLens.reporter.intervalMinutes=1  \
  --jars /home/abc/jars/spark-streaminglens_2.11-0.5.3.jar,\
 /home/abc/jars/kafka-clients-0.10.2.1.jar,\
  --master yarn \
  --deploy-mode cluster \
  --driver-cores 1  --driver-memory 2G  --executor-cores 1  --executor-memory 2G \
  --supervise --class com.jpb.datalake.SparkStreamingLens \
 /home/abc/jar/SparkStreamingLens-spark-utility_2.11-1.0.jar  \
 "jdbc:ignite:thin://00.000.00.00:00000;distributedJoins=true;user=aaaaaa;password=aaaaaaa;"  \
 SparkStreamingLens \
 argumentTable
Activity icon
issue

rpatid10 issue qubole/streaminglens

rpatid10
rpatid10

StreamingLens Insights always showing "Streaming Query State: NONEWBATCHES" in Logs.

Hi All,

I am using StreamingLens in my Spark application, loading data from Kafka to Kafka. I am continuously sending data to the source Kafka topic and it is loading into the destination topic, but the logs always look the same: whenever I push new data to the source topic, the BatchId gets updated but Streaming Query State: NONEWBATCHES stays the same. Can someone suggest why the state and recommendations are not updating in the logs?

|||||||||||||||||| StreamingLens Insights ||||||||||||||||||||||||| BatchId: 344 Analysis Time: 00s 000ms Expected Micro Batch SLA: 120s 000ms Batch Running Time: 00s 000ms Critical Time: 00s 000ms Streaming Query State: NONEWBATCHES ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Spark Submit Command:

spark-submit
--verbose
--name SparkStreamingLens
--num-executors 1
--conf streamingLens.reporter.intervalMinutes=1
--jars /home/abc/jars/spark-streaminglens_2.11-0.5.3.jar,
/home/abc/jars/kafka-clients-0.10.2.1.jar,
--master yarn
--deploy-mode cluster
--driver-cores 1 --driver-memory 2G --executor-cores 1 --executor-memory 2G
--supervise --class com.data.datalake.SparkStreamingLens
/home/abc/jar/SparkStreamingLens-spark-utility_2.11-1.0.jar

@abhishekd0907 @itsvikramagr @mayankahuja @shubhamtagra @jsensarma @ciso-drew

Activity icon
issue

rpatid10 issue qubole/streaminglens

rpatid10
rpatid10

Not able to see recommendation for StreamingLens in Logs?

Sep
29
2 months ago
Activity icon
issue

rpatid10 issue comment qubole/streaminglens

rpatid10
rpatid10

Not able to see recommendation for StreamingLens in Logs?

rpatid10
rpatid10

@abhishekd0907 Thanks a lot. It worked and I am able to see the logs: 21/09/29 17:38:42 INFO QueryInsightsManager: |||||||||||||||||| StreamingLens Inisights ||||||||||||||||||||||||| BatchId: 344 Analysis Time: 00s 000ms Expected Micro Batch SLA: 120s 000ms Batch Running Time: 00s 000ms Critical Time: 00s 000ms Streaming Query State: NONEWBATCHES ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

I am getting this information from the StreamingLens logs, but I am not able to understand how to use it for better resource allocation. For the batch application, Sparklens gave more information about cluster- and application-level resource utilization.

Kindly suggest whether I need to make any other changes, or whether there are other logs I should check.

Activity icon
issue

rpatid10 issue comment qubole/streaminglens

rpatid10
rpatid10

Not able to see recommendation for StreamingLens in Logs?

rpatid10
rpatid10

Can you share your Spark driver logs?

You need to add the line below to activate it:

val streamingLens = new StreamingLens(sparkSession, options)

Yes, I have added this line @abhishekd0907: val streamingLens = new SparkStreamingLens(spark, options)

Sep
28
2 months ago
Activity icon
issue

rpatid10 issue comment qubole/streaminglens

rpatid10
rpatid10

streaminglens did not print recommendation?

I have added the code below to my Spark Structured Streaming program, but it does not print a recommendation in the driver log. Why?

    var options: Map[String, String] = Map()
    options += ("streamingLens.expectedMicroBatchSLAMillis" -> "30000")
    options += ("streamingLens.reporter.intervalMinutes" -> "2")
    val streamingLens = new StreamingLens(spark, options)

rpatid10
rpatid10

Hi @jiemar, I'm also facing the same issue. Kindly let me know if you were able to resolve it: https://github.com/qubole/streaminglens/issues/4
