Breast cancer is one of the primary contributors to mortality among women. Several approaches are used to treat it, but recurrence occurs in 79% of the cases because the underlying mechanism of the protein molecules has not been carefully examined. The goal of this research was to use machine-learning tools to elucidate conserved regions and to obtain functional annotations of breast-cancer-related proteins. The sequences of five breast-cancer-related proteins (BRCA2, BCAR1, BCAR3, BCAR4, and BRMS1) and their annotations were retrieved from the UniProt and TCGA databases, respectively. Conserved regions were extracted using CLUSTALX. We constructed a phylogenetic tree using MEGA 7.0 and used the SUPERFAMILY database to obtain fine-grained domain annotations. The tree revealed that the BRCA2 and BCAR4 protein sequences are located in the same clade, which indicates that they have overlapping functions. Several protein domains were identified, including the SH2 and Ras GEF domains in BCAR3, the SH3 domain in BCAR1, and the helical domain, nucleic-acid-binding protein domain, and tower domain in BRCA2. No protein domains could be annotated for BCAR4 or BRMS1, which may indicate the presence of a disordered protein state. We suggest that each protein has distinct functionalities that are complementary in regulating the progression of breast cancer, although further study is necessary for confirmation. This protein-domain annotation project could be strengthened by fully integrating its mappings with gene and disease ontologies. Such integration is vital for obtaining biochemical insights into breast cancer.
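
The conserved-region step can be illustrated with a minimal Python sketch. It assumes a multiple sequence alignment has already been produced (for example by CLUSTALX) and simply reports runs of columns in which every sequence carries the same residue. This is not CLUSTALX's own scoring method, and the sequence names and residues below are hypothetical toy data, not the actual breast-cancer-related protein sequences.

```python
from typing import Dict, List, Tuple


def conserved_columns(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based indices of columns where all sequences share one residue."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {s[i] for s in seqs}
        # A column is conserved only if it holds one residue and no gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


def conserved_regions(columns: List[int], min_len: int = 2) -> List[Tuple[int, int]]:
    """Group consecutive conserved columns into (start, end) inclusive runs."""
    regions = []
    start = prev = None
    for c in columns:
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            if prev - start + 1 >= min_len:
                regions.append((start, prev))
            start = prev = c
    if start is not None and prev - start + 1 >= min_len:
        regions.append((start, prev))
    return regions


# Hypothetical toy alignment ("-" marks a gap); not real protein data.
toy_alignment = {
    "seq_a": "MKT-LLVAGS",
    "seq_b": "MKTALLVCGS",
    "seq_c": "MKT-LLIAGS",
}

cols = conserved_columns(toy_alignment)
regions = conserved_regions(cols)
print(cols)
print(regions)
```

Requiring strict identity in every column is the simplest possible criterion; a real analysis would more likely score columns with a substitution matrix so that chemically similar residues also count as conserved.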


