Cluster classification

“classification.sh”

  1. Unannotated clusters:
    • “cluster_ffindex_files.sh”: input = “data/cluster_refinement/refined/marine_hmp_refined_noannot_cl_ids.txt”; output = “data/cluster_classification/unannotated_cl/marine_hmp_refined_noannot_cl_cons.fasta”
    • “double_search.sh”:
      • Search vs UniRef90: input = “data/cluster_classification/unannotated_cl/marine_hmp_refined_noannot_cl_cons.fasta” and the UniRef90 DB; output = search hits, which are then parsed with the script “hypo_parser.sh” (which requires the awk script “evalue_06_filter.awk” and the file “unknown_grep.tsv”).
      • Search vs NCBI nr: input = the sequences with no hits in the previous search; output = search hits, parsed in the same way as the UniRef90 hits.
      • General output: the set of EUs and KWPs, and the preliminary set of GUs.
  2. Annotated clusters:
    • “pfam_domain_architect_ref.r”: input = “data/cluster_refinement/refined/marine_hmp_refined_annot_cl.tsv”; output = “data/cluster_classification/annotated_cl/kept_PF/DUFs” (the former are the Ks; the latter are added to the preliminary set of GUs) and “data/cluster_classification/annotated_cl/pfam_domain_architecture.tsv”.
  • Final output: a first set of cluster categories (Ks, KWPs, GUs, and EUs) that will be further refined through two HMM vs HMM searches.
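
The chaining of the two searches in “double_search.sh” (only the queries without hits in the first database go on to the second) can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes a tab-separated hit table with the query ID in the first column, and the function names and sequence IDs are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def fasta_ids(lines: Iterable[str]) -> List[str]:
    """Return the sequence IDs (first token after '>') in file order."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:  # skip malformed headers that carry no ID
                ids.append(tokens[0])
    return ids


def queries_with_hits(hit_lines: Iterable[str]) -> Set[str]:
    """Collect query IDs from a hit table.

    Assumption: tab-separated rows with the query ID in column 1;
    blank lines and lines starting with '#' are ignored.
    """
    hits = set()
    for line in hit_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hits.add(line.split("\t")[0])
    return hits


def split_by_hits(query_ids: Iterable[str],
                  hit_ids: Set[str]) -> Tuple[List[str], List[str]]:
    """Split queries into (with hits, no hits), preserving input order."""
    with_hits, no_hits = [], []
    for query in query_ids:
        (with_hits if query in hit_ids else no_hits).append(query)
    return with_hits, no_hits


if __name__ == "__main__":
    # Hypothetical consensus sequences and hit tables, for illustration only.
    consensus = [">cl_1 consensus", "MKT", ">cl_2", "MAA", ">cl_3", "MGG"]
    first_search = ["cl_1\tsubject_A\t1e-30"]
    second_search = ["cl_3\tsubject_B\t1e-10"]

    queries = fasta_ids(consensus)
    hit_first, no_hit_first = split_by_hits(queries, queries_with_hits(first_search))
    # Only the first-search no-hits are passed to the second search.
    hit_second, no_hit_second = split_by_hits(no_hit_first, queries_with_hits(second_search))

    print("hits in first search:", hit_first)
    print("hits in second search:", hit_second)
    print("no hits in either search:", no_hit_second)
```

Keeping the split as lists rather than sets preserves the order of the consensus FASTA, which makes the no-hit subset easy to write back out as the input of the second search.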
