Tmpri2-005.7z Apr 2026
These files typically contain curated sequences of proteins that cross cell membranes, used to distinguish between transmembrane helices, signal peptides, and globular domains.
The repository for DeepTMHMM contains the scripts and links to the underlying datasets used in the Nature Communications paper.
Context of the File
This dataset is primarily used in bioinformatics for training and evaluating machine learning models related to transmembrane protein topology prediction.
Associated Research Paper
The core research paper associated with this dataset is:
Authors: Jeppe Hallgren, Konstantinos D. Tsirigos, et al. Journal: Nature Communications (2022).
The "-005" suffix often indicates a specific cross-validation fold (e.g., the 5th split of the data) used during model training to check that the model's accuracy holds across different protein families.
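To make the idea of a cross-validation fold concrete, here is a minimal sketch of a family-aware split, in which all proteins from one family land in the same fold so that no family appears in both training and test data. It is an illustration only, not the procedure used by DeepTMHMM: the function names `assign_folds` and `split`, the protein IDs, and the family labels are all hypothetical, and it assumes each protein already has a family (cluster) label.

```python
from collections import defaultdict


def assign_folds(cluster_of, n_folds=5):
    """Map each protein ID to a fold index, keeping every cluster in one fold.

    cluster_of: dict of protein ID -> cluster (family) label. Labels are
    assumed to be comparable (e.g. all strings) so ties break deterministically.
    """
    clusters = defaultdict(list)
    for protein_id, cluster_id in cluster_of.items():
        clusters[cluster_id].append(protein_id)

    fold_sizes = [0] * n_folds
    fold_of = {}
    # Greedy balancing: place the largest clusters first, each into the
    # fold that currently holds the fewest proteins.
    for cluster_id in sorted(clusters, key=lambda c: (-len(clusters[c]), c)):
        target = fold_sizes.index(min(fold_sizes))
        for protein_id in clusters[cluster_id]:
            fold_of[protein_id] = target
        fold_sizes[target] += len(clusters[cluster_id])
    return fold_of


def split(fold_of, held_out):
    """Return (train_ids, test_ids) with fold `held_out` used for testing."""
    train = sorted(p for p, f in fold_of.items() if f != held_out)
    test = sorted(p for p, f in fold_of.items() if f == held_out)
    return train, test


# Hypothetical example data: six proteins in four families.
example = {
    "P1": "famA", "P2": "famA", "P3": "famB",
    "P4": "famC", "P5": "famC", "P6": "famD",
}
example_folds = assign_folds(example, n_folds=2)
example_train, example_test = split(example_folds, held_out=1)
```

Under this reading, a file such as the 5th split would hold the proteins of one held-out fold, and the model would be trained on the remaining folds. Whether the "-005" archive follows this layout is an assumption that should be checked against the repository.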