
Unable to parse huge x12 documents #43

Open
nddipiazza opened this issue Dec 29, 2020 · 4 comments

@nddipiazza
Contributor

What are the biggest files you have been able to parse with this parser?

I need to parse 837 files with thousands of claims in them.

To get unblocked, I added a split837 method to the X12Reader that does a map-reduce: it takes the huge x12 file and splits it into chunks, splitting at the child loops of the DETAIL loop.

Once the 837 is split into chunks, I just operate on those chunks separately using the normal parse method.
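
For illustration, here is a rough sketch of that chunking idea (not the actual split837 code; the class and method names are placeholders): read the raw segments, treat everything before the first HL*...*20 segment as the shared header, treat SE/GE/IEA as the shared trailer, and regroup the 2000A loops into smaller 837 strings. It assumes the conventional "~" segment and "*" element separators and does not renumber HL01 values or recompute the SE segment count, so a strict validator may still complain.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class Split837Sketch {

    /** Splits a large 837 into smaller, independently parseable 837 strings. */
    public static List<String> split837(Path input, int loopsPerChunk) throws IOException {
        String raw = new String(Files.readAllBytes(input), StandardCharsets.UTF_8).trim();
        String[] segments = raw.split("~");

        List<String> header = new ArrayList<>();        // ISA/GS/ST/BHT + 1000A/1000B loops
        List<String> trailer = new ArrayList<>();       // SE/GE/IEA
        List<List<String>> loops = new ArrayList<>();   // one entry per 2000A loop

        List<String> current = header;
        for (String segment : segments) {
            String seg = segment.trim();
            String[] elements = seg.split("\\*");
            if ("HL".equals(elements[0]) && elements.length > 3 && "20".equals(elements[3])) {
                // HL with hierarchy level 20 starts a new billing-provider (2000A) loop
                current = new ArrayList<>();
                loops.add(current);
            } else if ("SE".equals(elements[0]) || "GE".equals(elements[0]) || "IEA".equals(elements[0])) {
                current = trailer;
            }
            current.add(seg);
        }

        // reassemble: original header + N loops + original trailer per chunk
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < loops.size(); i += loopsPerChunk) {
            List<String> chunkSegments = new ArrayList<>(header);
            loops.subList(i, Math.min(i + loopsPerChunk, loops.size()))
                 .forEach(chunkSegments::addAll);
            chunkSegments.addAll(trailer);
            chunks.add(String.join("~", chunkSegments) + "~");
        }
        return chunks;
    }
}
```

Each returned chunk can then be written to a temp file and fed to the normal parse method.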

Has anyone else had to do something similar?

@angelaszek
Collaborator

We haven't run into this issue since we don't process huge X12 documents. Better handling of large files is something we would like to implement, but unfortunately it will be a little while before I am able to work on that.

@nddipiazza
Contributor Author

nddipiazza commented Dec 29, 2020

OK, I have a solution in place for the time being, but it is specific to 837s. It finds the DETAIL loop, which is the loop that can have thousands of child loops, and splits the file into smaller files. You can then use the normal parse on the smaller files and, once you are done processing them, reassemble the results if you need to.

You could also use a MapDB disk-based map instead of the heap data structures, but that would be slower.
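
For anyone curious about the MapDB route, this is roughly what a file-backed map looks like (the map name, keys, and values below are made up for illustration; this is not part of this library):

```java
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.Serializer;

import java.util.concurrent.ConcurrentMap;

public class DiskBackedMapSketch {
    public static void main(String[] args) {
        // file-backed store; memory-mapping speeds things up on 64-bit JVMs
        DB db = DBMaker.fileDB("x12-parse.db")
                .fileMmapEnable()
                .make();

        // drop-in replacement for an in-heap HashMap<String, String>
        ConcurrentMap<String, String> segmentsByClaim = db
                .hashMap("segments", Serializer.STRING, Serializer.STRING)
                .createOrOpen();

        segmentsByClaim.put("claim-1", "CLM*ABC123*500***11:B:1*Y*A*Y*Y");
        System.out.println(segmentsByClaim.get("claim-1"));

        db.close();
    }
}
```

The trade-off is exactly the one mentioned above: the data no longer has to fit on the heap, but every lookup goes through (memory-mapped) disk, so it is slower than the in-memory structures.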

@bitbythecron


Hi @nddipiazza, just curious: what size documents were you processing when you ran into this issue? What errors were you seeing? And do you remember any details of how you chunked the files up into multiple smaller ones? Thanks in advance for any and all help here!

@thesammiller

Hello @bitbythecron, were you able to learn anything about this issue? I am processing 837s with multiple claims in each file, but I am only getting one claim loop per file and otherwise seeing errors. Were you able to make any progress on chunking?
