Stream processing with Flink #649

Open
Giackgamba opened this issue Dec 21, 2023 · 1 comment
Labels
question Further information is requested

Comments


Giackgamba commented Dec 21, 2023

Background

Hi! I'm not an expert on COBOL/EBCDIC data structures, but I'm implementing a CDC scenario using Flink (in Java), and I have some binary fields to decode, given a copybook.

In the README you say that "The COBOL copybooks parser doesn't have a Spark dependency and can be reused for integrating into other data processing engines".

Question

Is that really the case? What, roughly, is the process for decoding a single message? Are there any examples that don't involve the Spark "wrapper"?

Thank you in advance

Giackgamba added the "question" label on Dec 21, 2023
yruslan (Collaborator) commented Dec 29, 2023

Hi, sorry for the delayed reply. Yes, Spark is not required. You can use the cobol-parser dependency, which does not depend on Spark (it still requires the Scala library as a dependency).

Here is an example of using Cobrix without Spark to convert some mainframe data to JSON, written as a unit test:
https://github.com/AbsaOSS/cobrix/blob/master/cobol-converters/src/test/scala/za/co/absa/cobrix/cobol/converters/extra/SerializersSpec.scala

One important detail: when Cobrix is used with Spark, it converts binary files to Spark DataFrames and uses the Spark type model. Without Spark, you can supply a custom RecordHandler instead. An example of such a handler is in the test suite linked above. It uses Array[Any] (in Java that would probably be Object[]).
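To make the handler idea concrete in Java terms, here is a minimal, self-contained sketch. It does not use Cobrix at all: the `Handler` interface, `ArrayHandler`, `MapHandler`, the field names, and the sample values are all hypothetical stand-ins, assuming only what is described above (the parser hands over the decoded values of one record as an `Object[]` and the handler decides what to build from them). The real trait and its exact method signatures are in the linked test suite and may differ from this.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecordHandlerSketch {

    // Hypothetical stand-in for a record handler. This is NOT Cobrix's
    // interface; it only models the idea that decoded field values arrive
    // as Object[] and the handler chooses the output representation.
    interface Handler<T> {
        T create(Object[] values, List<String> fieldNames);
    }

    // Pass-through handler: keeps the decoded values as a plain Object[].
    static final class ArrayHandler implements Handler<Object[]> {
        @Override
        public Object[] create(Object[] values, List<String> fieldNames) {
            return values;
        }
    }

    // Handler that pairs each value with its field name, which is a
    // convenient shape to serialize or to map onto a downstream row type.
    static final class MapHandler implements Handler<Map<String, Object>> {
        @Override
        public Map<String, Object> create(Object[] values, List<String> fieldNames) {
            if (values.length != fieldNames.size()) {
                throw new IllegalArgumentException("values and field names differ in length");
            }
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 0; i < values.length; i++) {
                row.put(fieldNames.get(i), values[i]);
            }
            return row;
        }
    }

    public static void main(String[] args) {
        // Made-up values standing in for one already-decoded record.
        Object[] decoded = {"ACME", 42, new BigDecimal("12.50")};
        List<String> names = Arrays.asList("COMPANY_NAME", "COMPANY_ID", "AMOUNT");

        Map<String, Object> row = new MapHandler().create(decoded, names);
        System.out.println(row);
    }
}
```

In a streaming job, a handler of this kind would typically be called once per incoming message, and its result converted to whatever record type the pipeline uses downstream.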

Let me know if you have any more questions on this.
