run_FragGeneScan.pl error - Hmmsearch failed #36

Georgios-Filis opened this issue Apr 1, 2024 · 2 comments


Georgios-Filis commented Apr 1, 2024

Dear Ziye Wang,
I am running into an error and would appreciate your help. I installed MetaBinner from source (using the metabinner_env.yaml file). The steps that generate the coverage profile and the composition profile run without problems. However, when I run "run_metabinner.sh" with the following command:

/home/user/work/rte/PSP_06_03_2024/ps_tools/metabinner/MetaBinner/run_metabinner.sh -a /home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa -o /home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins -d /home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/coverages/coverage_profile_f200.tsv -k /home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_kmer_4_f200.csv -p /home/user/work/rte/PSP_06_03_2024/ps_tools/metabinner/MetaBinner -t 8

I get the following error message:

2024-04-01 05:53:38,578 - Input arguments:
2024-04-01 05:53:38,578 - Contig_file:	/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa
2024-04-01 05:53:38,578 - Coverage_profiles:	/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/coverages/coverage_profile_f200.tsv
2024-04-01 05:53:38,578 - Composition_profiles:	/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_kmer_4_f200.csv
2024-04-01 05:53:38,579 - Output file path:	/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/result.tsv
2024-04-01 05:53:38,579 - Predefined Clusters:	Auto
2024-04-01 05:53:38,579 - The number of threads:	8
2024-04-01 05:53:39,027 - The number of contigs:	8831
2024-04-01 05:53:39,027 - gen bacar marker seed
2024-04-01 05:53:39,028 - exec cmd: run_FragGeneScan.pl -genome=/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa -out=/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.frag -complete=0 -train=complete -thread=8 1>/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.frag.out 2>/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.frag.err
2024-04-01 05:54:03,153 - exec cmd: hmmsearch --domtblout /home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.bacar_marker.hmmout --cut_tc --cpu 8 /home/rte/PSP_06_03_2024/ps_tools/metabinner/MetaBinner/scripts/../auxiliary/bacar_marker.hmm /home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.frag.faa 1>/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.bacar_marker.hmmout.out 2>/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.bacar_marker.hmmout.err
2024-04-01 05:54:03,185 - Hmmsearch failed! Not exist: /home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.bacar_marker.hmmout

real	0m26.021s
user	0m38.726s
sys	0m23.824s
cp: cannot stat '/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/intermediate_result/kmeans_length_weight_X_t_logtrans_result.tsv': No such file or directory
Processing file:	/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa
Reading Map:	/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/unitem_profile/kmeans_length_weight_X_t_logtrans_result.tsv
Traceback (most recent call last):
  File "/home/user/work/rte/PSP_06_03_2024/ps_tools/metabinner/MetaBinner/scripts/gen_bins_from_tsv.py", line 74, in <module>
    main(args.f, args.r, args.o)
  File "/home/user/work/rte/PSP_06_03_2024/ps_tools/metabinner/MetaBinner/scripts/gen_bins_from_tsv.py", line 40, in main
    with open(resultfile, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/unitem_profile/kmeans_length_weight_X_t_logtrans_result.tsv'
Input directory does not exists: /home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/unitem_profile/kmeans_length_weight_X_t_logtrans_result.tsv_bins

['X_t_logtrans_ori', '/home/rte/PSP_06_03_2024/results_sample_19/metabinner_results/bins/metabinner_res/unitem_profile/kmeans_length_weight_X_t_logtrans_result.tsv_bins']
Something went wrong with running unitem_profile.py. Please check CheckM installation. Exiting.

I have traced the error back to "run_FragGeneScan.pl", which writes an empty "contigs_formated_200.fa.frag.faa" file. As a result, no "contigs_formated_200.fa.bacar_marker.hmmout" file is created, and Hmmsearch fails. FragGeneScan also writes a "contigs_formated_200.fa.frag.err" file, which contains the following:

awk: cmd. line:1: fatal: cannot open file `/home/rte/PSP_06_03_2024/results_sample_19/contigs/contigs_formated_200.fa.frag.out' for reading (No such file or directory)
  1. Do you have any idea what is going on?
  2. In addition, can MetaBinner be installed and run properly in the conda environment with Python versions other than 3.7.6?
  3. Would it be OK to install MetaBinner in the environment first and Python 3.7.6 afterwards?
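
The chain described above (an empty ".faa" file leading to a missing ".hmmout" file) can be checked quickly before rerunning the whole pipeline. The following is a minimal sketch and not part of MetaBinner: the function name check_outputs is hypothetical, and the file suffixes are simply the ones that appear in the log above.

```python
import os


def check_outputs(paths):
    """Return a dict mapping each path to 'missing', 'empty', or 'ok'."""
    status = {}
    for path in paths:
        if not os.path.exists(path):
            status[path] = "missing"
        elif os.path.getsize(path) == 0:
            status[path] = "empty"
        else:
            status[path] = "ok"
    return status


if __name__ == "__main__":
    # Replace with the full path to the contig file used in the run.
    prefix = "contigs_formated_200.fa"
    suffixes = [".frag.out", ".frag.faa", ".frag.err", ".bacar_marker.hmmout"]
    for path, state in check_outputs([prefix + s for s in suffixes]).items():
        print(state, path, sep="\t")
```

An "empty" ".frag.faa" together with a "missing" ".bacar_marker.hmmout" would match the failure reported here.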

Sincerely,
Georgios Filis

Owner

ziyewang commented Apr 1, 2024

Hi,

It seems that no relevant genes were identified. MetaBinner typically runs on sequences longer than 1000 bp. If you wish to run it on sequences longer than 200 bp, you can try adding the parameter '-l 200'. However, good results are not guaranteed.
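
To see how many contigs would survive a given length cutoff, the contig lengths can be counted directly from the FASTA file. This is a minimal sketch, not MetaBinner code: the function names are hypothetical, and it assumes a cutoff that keeps contigs of at least the given length, which may differ slightly from how '-l' is applied internally.

```python
def contig_lengths(fasta_lines):
    """Map each contig name to its sequence length, given FASTA lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def count_at_least(lengths, min_len):
    """Count contigs whose length is at least min_len (assumed cutoff rule)."""
    return sum(1 for n in lengths.values() if n >= min_len)
```

For example, after `with open("contigs_formated_200.fa") as fh: lengths = contig_lengths(fh)`, comparing `count_at_least(lengths, 1000)` with `count_at_least(lengths, 200)` shows how many of the contigs fall below the typical 1000 bp range.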

MetaBinner can potentially be installed and run properly in a conda environment with versions of Python other than 3.7.6.

Best,
Ziye

Author

Georgios-Filis commented Apr 2, 2024

Dear Ziye Wang,
Thank you for your quick response. After looking into this further, I realized that the problem is that FragGeneScan cannot produce its final combined "out" file when it is run with the "1>" and "2>" output redirections. Without these redirections, I believe the "out" file is generated normally, even when multiple threads are used. Please correct me if I am mistaken about this. The problem is probably caused by my system having an older version of GLIBC (glibc/libc6), which unfortunately I cannot update.

Would it be possible to add an option to MetaBinner (and ideally to COMEBin too) that either disables the redirections or uses FragGeneScanRs instead of FragGeneScan? I have checked that the combined "out" file is generated when FragGeneScanRs is used with the redirections. Either option would be really helpful.

Edit:
I have tried removing only the "1> ..." redirection from the FragGeneScan command ("fragCmd") in the MetaBinner files "split_hbins.py" and "component_binning.py", and likewise in the COMEBin file "utils.py". In both cases the output is generated as expected. It would be very helpful if you could confirm that these modifications to MetaBinner and COMEBin do not affect any other step of the pipeline that I may not have taken into account.
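
The workaround can be sketched as a command builder that makes the stdout redirection optional. This is an illustration only, not the actual "fragCmd" code from MetaBinner or COMEBin: build_frag_cmd and the ".stdout.log" suffix are hypothetical, while the FragGeneScan flags are copied from the logged command. In that logged command, the "1>" target ("contigs_formated_200.fa.frag.out") is the same path as the ".out" file named in the awk error; whether that overlap is the cause on this system is an assumption, not something confirmed in this thread.

```python
import shlex


def build_frag_cmd(contig_file, threads, redirect_stdout=False):
    """Build a run_FragGeneScan.pl shell command string (hypothetical helper).

    stderr always goes to <prefix>.err. stdout is left alone unless
    redirect_stdout is True, in which case it goes to a separate log file
    so that it cannot collide with <prefix>.out.
    """
    prefix = contig_file + ".frag"
    cmd = (
        "run_FragGeneScan.pl"
        + " -genome=" + shlex.quote(contig_file)
        + " -out=" + shlex.quote(prefix)
        + " -complete=0 -train=complete"
        + " -thread=" + str(threads)
    )
    if redirect_stdout:
        cmd += " 1>" + shlex.quote(prefix + ".stdout.log")
    cmd += " 2>" + shlex.quote(prefix + ".err")
    return cmd


if __name__ == "__main__":
    print(build_frag_cmd("contigs_formated_200.fa", 8))
```

With the default redirect_stdout=False, the resulting command matches the modification described above: only the "2>" redirection remains.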

Sincerely,
Georgios Filis
