Stream processing with Flink #649

Giackgamba · 2023-12-21T15:19:41Z

Background

Hi! I'm not an expert on COBOL/EBCDIC data structures, but I'm implementing a CDC scenario using Flink (in Java), and I have some binary fields to decode, given a copybook.

In the README you say that "The COBOL copybooks parser doesn't have a Spark dependency and can be reused for integrating into other data processing engines".

Question

Is that really the case? Roughly, what is the process for decoding a single message? Are there any examples that don't involve the Spark "wrapper"?

Thank you in advance

yruslan · 2023-12-29T21:28:38Z

Hi, sorry for the delayed reply. Yes, Spark is not required: you can use the cobol-parser dependency, which does not depend on Spark (it still requires the Scala library as a dependency).

Here is an example, written as a unit test, of Cobrix used without Spark to convert some mainframe data to JSON:
https://github.com/AbsaOSS/cobrix/blob/master/cobol-converters/src/test/scala/za/co/absa/cobrix/cobol/converters/extra/SerializersSpec.scala

One important detail: when Cobrix is used with Spark, it converts binary files to Spark dataframes and uses the Spark type model. When Spark is not used, you can supply a custom RecordHandler instead. An example of such a handler is in the test suite linked above. It uses Array[Any] (in Java that would probably be Object[]).
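To make the handler idea a bit more concrete on the Java side, here is a rough sketch of code that receives the decoded values of one record as an Object[] and turns them into a Map keyed by field name. Everything in it is an assumption made for illustration: `SimpleRecordHandler`, `MapRecordHandler`, the `create` signature, and the field names are hypothetical stand-ins, not Cobrix's actual RecordHandler trait. Check the linked test suite for the real signature before relying on this.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record handler. This is NOT Cobrix's RecordHandler trait.
interface SimpleRecordHandler<T> {
    // Builds one output record from already-decoded field values, given in field order.
    T create(Object[] values, List<String> fieldNames);
}

// Turns the decoded values into an insertion-ordered Map keyed by field name.
final class MapRecordHandler implements SimpleRecordHandler<Map<String, Object>> {
    @Override
    public Map<String, Object> create(Object[] values, List<String> fieldNames) {
        if (values.length != fieldNames.size()) {
            throw new IllegalArgumentException(
                "Expected " + fieldNames.size() + " values, got " + values.length);
        }
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            record.put(fieldNames.get(i), values[i]);
        }
        return record;
    }
}

final class HandlerSketch {
    public static void main(String[] args) {
        // Pretend these values were decoded from one binary message (made-up data).
        Object[] decoded = {"ACME", 42};
        List<String> names = Arrays.asList("COMPANY_NAME", "COMPANY_ID");
        System.out.println(new MapRecordHandler().create(decoded, names));
    }
}
```

In a Flink job, something along these lines would sit inside whatever function deserializes each incoming message, with the Map (or a POJO built from it) passed downstream.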

Let me know if you have any more questions on this.

Giackgamba added the "question" label (Further information is requested) on Dec 21, 2023
