You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I followed the streaming documentation successfully and went through the dataflow documentation, but found it's both not up to date and doesn't work for me.
#1 for the bigquery command, this isn't in the repo: src/main/resources/errors-schema.json
#2 This appears in the logs, but no tables are ever created in BigQuery: "Executing operation polygonBlocksReadFromPubSub/PubsubUnboundedSource+polygonBlocksReadFromPubSub/MapElements/Map+polygonBlocksConvertToTableRows+polygonBlocksWriteToBigQuery/PrepareWrite/ParDo(Anonymous)+polygonBlocksWriteToBigQuery/StreamingInserts/CreateTables/ParDo(CreateTables)+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/ShardTableWrites+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/TagWithUniqueIds+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/Reshuffle/Window.Into()/Window.Assign+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/Reshuffle/GroupByKey/WriteStream"
#3 In the PolygonBlocksWriteToBigQuery stage, there are a bunch of warns, I'm not sure if that's relevant, but no more logs appear after that. I've mostly focused on getting the blocks from my local polygon to bigquery: 2022-07-07T18:55:58.630ZOperation ongoing in step polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/StreamingWrite for at least 02h10m00s without outputting or completing in state finish at [email protected]/jdk.internal.misc.Unsafe.park(Native Method) at [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at [email protected]/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447) at [email protected]/java.util.concurrent.FutureTask.get(FutureTask.java:190) at app//org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:817) at app//org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:882) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:143) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:115) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
I would really like to get this ETL job running properly so any advice would be great
The text was updated successfully, but these errors were encountered:
I followed the streaming documentation successfully and went through the dataflow documentation, but found it's both not up to date and doesn't work for me.
#1 for the bigquery command, this isn't in the repo:
src/main/resources/errors-schema.json
#2 This appears in the logs, but no tables are ever created in BigQuery:
"Executing operation polygonBlocksReadFromPubSub/PubsubUnboundedSource+polygonBlocksReadFromPubSub/MapElements/Map+polygonBlocksConvertToTableRows+polygonBlocksWriteToBigQuery/PrepareWrite/ParDo(Anonymous)+polygonBlocksWriteToBigQuery/StreamingInserts/CreateTables/ParDo(CreateTables)+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/ShardTableWrites+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/TagWithUniqueIds+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/Reshuffle/Window.Into()/Window.Assign+polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/Reshuffle/GroupByKey/WriteStream"
#3 In the PolygonBlocksWriteToBigQuery stage, there are a bunch of warns, I'm not sure if that's relevant, but no more logs appear after that. I've mostly focused on getting the blocks from my local polygon to bigquery:
2022-07-07T18:55:58.630ZOperation ongoing in step polygonBlocksWriteToBigQuery/StreamingInserts/StreamingWriteTables/StreamingWrite for at least 02h10m00s without outputting or completing in state finish at [email protected]/jdk.internal.misc.Unsafe.park(Native Method) at [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at [email protected]/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447) at [email protected]/java.util.concurrent.FutureTask.get(FutureTask.java:190) at app//org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:817) at app//org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:882) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:143) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:115) at app//org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
I would really like to get this ETL job running properly so any advice would be great
The text was updated successfully, but these errors were encountered: