bug(c++,spark): clean the output directory when generating data in unit test #584

Open
acezen opened this issue Aug 12, 2024 · 8 comments
Labels: bug, good first issue

Comments

@acezen
Contributor

acezen commented Aug 12, 2024

Describe the bug, including details regarding any error messages, version, and platform.

In the C++ and Spark unit tests, we usually write the generated GraphAr data to the /tmp directory and may check the number of generated files with an assertion. But the C++ and Spark unit tests may leave stray files in each other's output directories, which makes the assertion fail.

I suggest we clean the output directory before writing out the files in the C++ and Spark unit tests.
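To illustrate the suggestion, here is a minimal C++17 sketch of resetting an output directory before a test writes to it, so that a file-count assertion is not affected by leftovers from another run. The helper names `ResetOutputDir`, `CountFiles`, and `WriteChunks` are hypothetical and are not part of GraphAr; `WriteChunks` only stands in for a real writer test.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not a GraphAr API): delete `dir` with all of its
// contents, then recreate it empty so the test starts from a known state.
void ResetOutputDir(const fs::path& dir) {
  fs::remove_all(dir);  // does nothing if `dir` does not exist
  fs::create_directories(dir);
}

// Count the regular files directly inside `dir` (non-recursive).
std::size_t CountFiles(const fs::path& dir) {
  std::size_t n = 0;
  for (const auto& entry : fs::directory_iterator(dir)) {
    if (entry.is_regular_file()) {
      ++n;
    }
  }
  return n;
}

// Stand-in for a writer test: writes `num_chunks` placeholder files.
void WriteChunks(const fs::path& dir, int num_chunks) {
  for (int i = 0; i < num_chunks; ++i) {
    std::ofstream out(dir / ("chunk" + std::to_string(i)));
    out << "data";
  }
}
```

A test would call `ResetOutputDir(dir)` first, then write its data, and only then assert on `CountFiles(dir)`. Without the reset, any file left behind by another test suite in the same directory would inflate the count.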

Solution

As Sem suggested, we can make the clean operation part of the top-level make clean of the C++/Spark libraries.

Component(s)

C++, Spark

@acezen added the bug and good first issue labels Aug 12, 2024
@SemyonSinchenko
Member

What do you think about making it a part of the top-level make clean command?

@acezen changed the title from "bug(c++,spark): clear the output directory when generating data in unit test" to "bug(c++,spark): clean the output directory when generating data in unit test" Aug 12, 2024
@acezen
Contributor Author

acezen commented Aug 12, 2024

> What do you think about making it a part of the top-level make clean command?

Good advice, and that applies to C++ too!

@SumitkumarSatpute

I came across this issue and would love to help out. Is there any additional information or context I should be aware of before I get started?

Looking forward to contributing!

@acezen
Contributor Author

acezen commented Aug 13, 2024

> I came across this issue and would love to help out. Is there any additional information or context I should be aware of before I get started?
>
> Looking forward to contributing!

Hi @SumitkumarSatpute, thanks for your interest in GraphAr. The temporary data is generated by the write unit tests:

So I think you need to identify the output directories used by the unit tests, such as /tmp/vertex, /tmp/edge, and /tmp/ldbc, and clean them in the top-level make clean of the libraries.

Feel free to ask if you have any questions, and enjoy the trip :)

@SemyonSinchenko
Member

I see it in the following way:

  • we have a Makefile in each subproject (cpp, maven; pyspark already has one);
  • each subproject's Makefile contains a clean command that deletes all the created temporary data, all the generated code, all the compiled classes, etc. For maven it should be something like mvn clean, plus deleting the corresponding tmp folder and downloaded artifacts like spark binaries;
  • we have a top-level Makefile with a clean command that just runs clean in each subproject, one by one.

@acezen
Contributor Author

acezen commented Aug 13, 2024

> I see it in the following way:
>
>   • we have a Makefile in each subproject (cpp, maven; pyspark already has one);
>   • each subproject's Makefile contains a clean command that deletes all the created temporary data, all the generated code, all the compiled classes, etc. For maven it should be something like mvn clean, plus deleting the corresponding tmp folder and downloaded artifacts like spark binaries;
>   • we have a top-level Makefile with a clean command that just runs clean in each subproject, one by one.

Good supplement, thanks Sem.

@SumitkumarSatpute

Please let me know how to reproduce this scenario for C++, Spark, or any other affected component.

@SemyonSinchenko
Member

For maven, it is enough to run the tests the same way they run in CI: mvn test
