Object

org.apache.spark.sql.pipelines.graph.GraphExecution

All Implemented Interfaces:: org.apache.spark.internal.Logging

Direct Known Subclasses:: TriggeredGraphExecution

public abstract class GraphExecution extends Object implements org.apache.spark.internal.Logging

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static interface

GraphExecution.FlowExecutionAction

static interface

GraphExecution.FlowExecutionStopReason

Represents the reason why a flow execution should be stopped.

static class

GraphExecution.RetryFlowExecution$

Indicates that the flow execution should be retried.

static class

GraphExecution.StopFlowExecution

Indicates that the flow execution should be stopped with a specific reason.

static class

GraphExecution.StopFlowExecution$

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary

Constructors

Constructor

Description

GraphExecution(DataflowGraph graphForExecution, PipelineUpdateContext env)
Method Summary

Modifier and Type

Method

Description

abstract void

awaitCompletion()

Blocks the current thread while any flows are queued or running.

static GraphExecution.FlowExecutionAction

determineFlowExecutionActionFromError(scala.Function0<Throwable> ex, scala.Function0<String> flowDisplayName, scala.Function0<Object> currentNumTries, scala.Function0<Object> maxAllowedRetries)

Analyze the exception thrown by flow execution and figure out if we should retry the execution, or we need to reanalyze the flow entirely to resolve issues like schema changes.

scala.collection.concurrent.TrieMap<org.apache.spark.sql.catalyst.TableIdentifier,FlowExecution>

flowExecutions()

FlowExecutions currently being executed and tracked by the graph execution.

abstract RunTerminationReason

getRunTerminationReason()

Returns the reason why this flow execution has terminated.

DataflowGraph

graphForExecution()

static org.apache.spark.internal.Logging.LogStringContext

LogStringContext(scala.StringContext sc)

int

maxRetryAttemptsForFlow(org.apache.spark.sql.catalyst.TableIdentifier flowName)

static org.slf4j.Logger

org$apache$spark$internal$Logging$$log_()

static void

org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)

scala.Option<FlowExecution>

planAndStartFlow(ResolvedFlow flow)

Plans the logical ResolvedFlow into a FlowExecution and then starts executing it.

void

start()

Starts the execution of flows in graphForExecution.

void

stop()

Stops this execution by stopping all streams and terminating any other resources.

void

stopFlow(FlowExecution pf)

Stops execution of a `FlowExecution`.

void

stopThread(Thread thread)

Stop a thread timeout.

abstract Trigger

streamTrigger(Flow flow)

The `Trigger` configuration for a streaming flow.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Constructor Details
- GraphExecution
  
  public GraphExecution(DataflowGraph graphForExecution, PipelineUpdateContext env)
Method Details
- determineFlowExecutionActionFromError
  
  public static GraphExecution.FlowExecutionAction determineFlowExecutionActionFromError(scala.Function0<Throwable> ex, scala.Function0<String> flowDisplayName, scala.Function0<Object> currentNumTries, scala.Function0<Object> maxAllowedRetries)
  
  Analyze the exception thrown by flow execution and figure out if we should retry the execution, or we need to reanalyze the flow entirely to resolve issues like schema changes. This should be the narrow waist for all exception analysis in flow execution. TODO: currently it only handles schema change and max retries, we should aim to extend this to include other non-retryable exception as well so we can have a single SoT for all these error matching logic.
  
  Parameters:
  
  ex - Exception to analyze.
  
  flowDisplayName - The user facing flow name with the error.
  
  currentNumTries - Number of times the flow has been tried.
  
  maxAllowedRetries - Maximum number of retries allowed for the flow.
  
  Returns:
  
  (undocumented)
- org$apache$spark$internal$Logging$$log_
  
  public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
- org$apache$spark$internal$Logging$$log__$eq
  
  public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
- LogStringContext
  
  public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)
- graphForExecution
  
  public DataflowGraph graphForExecution()
- streamTrigger
  
  public abstract Trigger streamTrigger(Flow flow)
  
  The `Trigger` configuration for a streaming flow.
- flowExecutions
  
  public scala.collection.concurrent.TrieMap<org.apache.spark.sql.catalyst.TableIdentifier,FlowExecution> flowExecutions()
  
  FlowExecutions currently being executed and tracked by the graph execution.
  
  Returns:
  
  (undocumented)
- planAndStartFlow
  
  public scala.Option<FlowExecution> planAndStartFlow(ResolvedFlow flow)
  
  Plans the logical ResolvedFlow into a FlowExecution and then starts executing it. Implementation note: Thread safe
  
  Parameters:
  
  flow - (undocumented)
  
  Returns:
  
  None if the flow planner decided that there is no actual update required here. Otherwise returns the corresponding physical flow.
- start
  
  public void start()
  
  Starts the execution of flows in graphForExecution. Does not block.
- stop
  
  public void stop()
  
  Stops this execution by stopping all streams and terminating any other resources.
  This method may be called multiple times due to race conditions and must be idempotent.
- stopFlow
  
  public void stopFlow(FlowExecution pf)
  
  Stops execution of a `FlowExecution`.
- awaitCompletion
  
  public abstract void awaitCompletion()
  
  Blocks the current thread while any flows are queued or running. Returns when all flows that could be run have completed. When this returns, all flows are either SUCCESSFUL, TERMINATED_WITH_ERROR, SKIPPED, CANCELED, or EXCLUDED.
- getRunTerminationReason
  
  public abstract RunTerminationReason getRunTerminationReason()
  
  Returns the reason why this flow execution has terminated. If the function is called before the flow has not terminated yet, the behavior is undefined, and may return UnexpectedRunFailure.
  
  Returns:
  
  (undocumented)
- maxRetryAttemptsForFlow
  
  public int maxRetryAttemptsForFlow(org.apache.spark.sql.catalyst.TableIdentifier flowName)
- stopThread
  
  public void stopThread(Thread thread)
  
  Stop a thread timeout.
  
  Parameters:
  
  thread - (undocumented)

Class GraphExecution

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Methods inherited from interface org.apache.spark.internal.Logging

Constructor Details

GraphExecution

Method Details

determineFlowExecutionActionFromError

org$apache$spark$internal$Logging$$log_

org$apache$spark$internal$Logging$$log__$eq

LogStringContext

graphForExecution

streamTrigger

flowExecutions

planAndStartFlow

start

stop

stopFlow

awaitCompletion

getRunTerminationReason

maxRetryAttemptsForFlow

stopThread