Replication Data for: Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction (doi:10.21979/N9/MEDJN1)


Document Description

Citation

Title:

Replication Data for: Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction

Identification Number:

doi:10.21979/N9/MEDJN1

Distributor:

DR-NTU (Data)

Date of Distribution:

2023-06-23

Version:

1

Bibliographic Citation:

Wee, JunJie; Xia, Kelin, 2023, "Replication Data for: Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction", https://doi.org/10.21979/N9/MEDJN1, DR-NTU (Data), V1

Study Description

Citation

Title:

Replication Data for: Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction

Identification Number:

doi:10.21979/N9/MEDJN1

Authoring Entity:

Wee, JunJie (Nanyang Technological University)

Xia, Kelin (Nanyang Technological University)

Software used in Production:

Python

Software used in Production:

MATLAB

Software used in Production:

numpy

Software used in Production:

scipy

Software used in Production:

scikit-learn

Software used in Production:

GUDHI

Grant Number:

M4081842.110

Grant Number:

RG109/19

Grant Number:

MOE-T2EP20120-0013

Grant Number:

MOE-T2EP20220-0010

Distributor:

DR-NTU (Data)

Access Authority:

Xia, Kelin

Depositor:

Wee, JunJie

Date of Deposit:

2023-06-12

Holdings Information:

https://doi.org/10.21979/N9/MEDJN1

Study Scope

Keywords:

Computer and Information Science, Mathematical Sciences, Medicine, Health and Life Sciences, Protein-protein interaction, Hodge Laplacian, Persistent spectral, Molecular featurization, Ensemble learning

Abstract:

Protein–protein interactions (PPIs) play a significant role in nearly all cellular and biological activities. Data-driven machine learning models have demonstrated great power in PPI analysis. However, the design of efficient molecular featurization remains a major challenge for all PPI learning models. Here, we propose, for the first time, persistent spectral (PerSpect) based PPI representation and featurization, and PerSpect-based ensemble learning (PerSpect-EL) models for PPI binding affinity prediction. In our model, a sequence of Hodge (or combinatorial) Laplacian (HL) matrices at different scales is generated from a specially designed filtration process. PerSpect attributes, which are statistical and combinatorial properties of the spectral information from these HL matrices, are used as features for PPI characterization. Each PerSpect attribute is input into a 1D convolutional neural network (CNN), and these CNNs are stacked together in our PerSpect-based ensemble learning models. We systematically test our model on the two most commonly used datasets, i.e., SKEMPI and AB-Bind. To the best of our knowledge, our model achieves state-of-the-art results and outperforms all existing models.
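
The featurization idea in the abstract (a sequence of Laplacian matrices over a filtration, summarized by statistics of their spectra) can be illustrated with a minimal sketch. This is not the code in the deposited archive. It assumes a plain distance-cutoff filtration on a toy point set and uses only the 0-dimensional Hodge Laplacian, which is the ordinary graph Laplacian; the abstract does not spell out the exact filtration or the full attribute list. The function names, the cutoff values, and the choice of statistics are all hypothetical.

```python
import numpy as np

def graph_laplacian(points, cutoff):
    """0-dimensional Hodge Laplacian (graph Laplacian, L = D - A) of the
    graph joining every pair of points at distance <= cutoff."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Connect close pairs, excluding self-loops on the diagonal.
    adj = ((dist <= cutoff) & ~np.eye(len(points), dtype=bool)).astype(float)
    return np.diag(adj.sum(axis=1)) - adj

def spectral_attributes(laplacian, tol=1e-8):
    """Summary statistics of a Laplacian spectrum (illustrative choice)."""
    eig = np.linalg.eigvalsh(laplacian)   # symmetric matrix: real eigenvalues
    nonzero = eig[eig > tol]
    stats = {"zero_count": int(eig.size - nonzero.size)}
    if nonzero.size == 0:
        stats.update(min=0.0, max=0.0, mean=0.0, std=0.0, sum=0.0)
    else:
        stats.update(min=float(nonzero.min()), max=float(nonzero.max()),
                     mean=float(nonzero.mean()), std=float(nonzero.std()),
                     sum=float(nonzero.sum()))
    return stats

def filtration_features(points, cutoffs):
    """One attribute dictionary per filtration value (assumed cutoffs)."""
    return [spectral_attributes(graph_laplacian(points, c)) for c in cutoffs]

# Toy coordinates standing in for atom positions (hypothetical data).
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [5.0, 5.0, 5.0]])
features = filtration_features(points, cutoffs=[0.5, 1.2, 1.5, 10.0])
for cutoff, stats in zip([0.5, 1.2, 1.5, 10.0], features):
    print(cutoff, stats)
```

For a graph Laplacian, the number of zero eigenvalues equals the number of connected components, so `zero_count` tracks how the point set merges as the cutoff grows, while the nonzero eigenvalues carry additional geometric information. In a pipeline like the one the abstract describes, each attribute traced across the filtration would form a 1D signal; how those signals are fed to the CNNs is not shown here.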

Kind of Data:

Calculated data and codes

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Identification Number:

10.1093/bib/bbac024

Bibliographic Citation:

Wee, J., & Xia, K. (2022). Persistent spectral based ensemble learning (PerSpect-EL) for protein–protein binding affinity prediction. Briefings in Bioinformatics, 23(2).

Citation

Identification Number:

10356/162232

Bibliographic Citation:

Wee, J. & Xia, K. (2022). Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction. Briefings in Bioinformatics, 23(2).

Other Study-Related Materials

Label:

PerSpect-Ensemble-Learning.rar

Notes:

application/x-rar-compressed