Exploration on Deep Drug Discovery: Representation and Learning

Authors

Liu, Shengchao

Type

Technical Report

Abstract

Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the dataset and evaluation strategy. We start by considering a wide range of ligand-based machine learning and docking-based approaches for virtual screening, and present a strategy for choosing the best algorithm for prospective compound prioritization. During this process, we find that the input information can affect model performance. We therefore emphasize the impact of different levels of molecular representation and introduce the N-gram graph, a novel representation for molecular graphs. Combined with traditional machine learning models, the N-gram graph can reach state-of-the-art performance. Another issue we observe is that multi-task learning can negatively impact performance on some individual tasks. We propose a reinforced multi-task learning (RMTL) framework, and preliminary results show that RMTL can address this issue in two-task cases.

Citation

TR1854
