
I am running a DAG with the Dataflow Java operator in Cloud Composer (Airflow):

import airflow.utils.dates
from airflow.providers.google.cloud.operators.dataflow import DataflowCreateJavaJobOperator

default_args = {
    'retries': 0,
    'start_date': airflow.utils.dates.days_ago(0),
    'owner': 'Airflow_TEST',
    'dataflow_default_options': {
        'project':           'gcp_project',
        'region':            'us-west',
        'serviceAccount':    'my_airflow_composer.SA.com',
        'stagingLocation':   'gs://project/my_dir/staging/',
        'tempLocation':      'gs://project/my_dir/tmp/',
        'subnetwork':        'subnetworks/dataflow',
        'workerMachineType': 'my-vm',
        'usePublicIps':      'false',
        'filesToStage':      'lib1.jar,lib2.jar',
        'secretsPath':       'gs://gcp-secret',
        'dataflowKmsKey':    'crypto-key',
        'inputFilePath':     'gs://project/my_dir/job/myfile.txt',
        'outputDirectory':   'gs://project/my_dir/job/output'
    }
}
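
The task below attaches to a dag object defined elsewhere in the file; a minimal sketch of what that definition looks like (the dag_id and schedule are placeholders, not my real values):

from airflow import DAG

# Placeholder DAG definition for context; dag_id and schedule_interval
# are not the real values from my environment.
dag = DAG(
    dag_id='dataflow_java_job',
    default_args=default_args,
    schedule_interval=None,
)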

job_dataflow = DataflowCreateJavaJobOperator(
    gcp_conn_id='google_cloud_default',
    task_id='job_dataflow',
    jar='/tmp/lib1.jar',
    job_class='com.pipeline',
    trigger_rule='all_done',
    dag=dag
)
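
As I understand it, the operator flattens everything under dataflow_default_options into --key=value arguments for the jar's main(), so custom options like inputFilePath have to be parsed by the Java pipeline itself. Roughly (the helper name is hypothetical, for illustration only):

# Hypothetical helper, for illustration only: each option becomes a
# --key=value flag appended to the java -jar invocation.
def options_to_pipeline_args(options: dict) -> list:
    return ['--{}={}'.format(key, value) for key, value in options.items()]

# options_to_pipeline_args({'tempLocation': 'gs://project/my_dir/tmp/'})
# -> ['--tempLocation=gs://project/my_dir/tmp/']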

But my Dataflow job fails with the error below:

[2024-05-21, 15:48:16 EDT] {beam.py:113} WARNING - [main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2024-05-21T19:48:13.974Z: Autoscaling is enabled for job 2024-05-21_12_48_12-3028974577634462771. The number of workers will be between 1 and 4000.
[2024-05-21, 15:48:16 EDT] {beam.py:113} WARNING - [main] INFO org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2024-05-21T19:48:14.113Z: Autoscaling was automatically enabled for job 2024-05-21_12_48_12-3028974577634462771.
[2024-05-21, 15:48:16 EDT] {beam.py:113} WARNING - [main] ERROR org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler - 2024-05-21T19:48:14.919Z: Runnable workflow has no steps specified.
[2024-05-21, 15:48:21 EDT] {beam.py:113} WARNING - [main] INFO org.apache.beam.runners.dataflow.DataflowPipelineJob - Job 2024-05-21_12_48_12-3028974577634462771 failed with status FAILED.
[2024-05-21, 15:48:21 EDT] {beam.py:161} INFO - Process exited with return code: 0
[2024-05-21, 15:48:21 EDT] {dataflow.py:437} INFO - Start waiting for done.
[2024-05-21, 15:48:21 EDT] {dataflow.py:381} INFO - Google Cloud DataFlow job myjob-dataflow-6482d21e is state: JOB_STATE_FAILED
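
My reading of "Runnable workflow has no steps specified" is that the job graph submitted to Dataflow contained no transforms, i.e. the pipeline's main() reached run() without applying any steps. A minimal sketch of that failure mode, using the Python SDK for brevity (my actual pipeline is the Java class above):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Sketch of the failure mode: run() is called on a pipeline with no
# transforms applied, so the submitted job graph has no steps and the
# Dataflow service rejects it with the error above.
empty = beam.Pipeline(options=PipelineOptions())
empty.run()

# By contrast, a pipeline with at least one transform applied before
# run() produces a runnable graph:
ok = beam.Pipeline(options=PipelineOptions())
(ok
 | 'Read' >> beam.io.ReadFromText('gs://project/my_dir/job/myfile.txt')
 | 'Write' >> beam.io.WriteToText('gs://project/my_dir/job/output'))
ok.run()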

Any idea what I am missing here?
