Complex Network Analysis for Scientific Collaboration Prediction and Biological Hypothesis Generation

dc.contributor.advisor: Susan McRoy
dc.contributor.advisor: Hong Yu
dc.contributor.committeemember: Hong Yu
dc.contributor.committeemember: Susan McRoy
dc.contributor.committeemember: Christine Cheng
dc.contributor.committeemember: Rohit J. Kate
dc.contributor.committeemember: Peter J. Tonellato
dc.creator: Zhang, Qing
dc.date.accessioned: 2025-01-16T19:58:14Z
dc.date.issued: 2014-08-01
dc.description.abstract: With the rapid growth of digitized literature, more and more knowledge is being discovered by computational approaches. This thesis addresses the problem of link prediction in co-authorship networks and protein-protein interaction networks derived from the literature. These networks, like most other types of networks, grow over time, and we assume that a machine can learn from past link creations by examining the state of the network at the time each link was created. Our goal is to create a computationally efficient approach to recommending new links for a node in a network (e.g., new collaborations in co-authorship networks and new interactions in protein-protein interaction networks). We treat edges in a network that satisfy certain criteria as training instances for the machine learning algorithms. We analyze the neighborhood structure of each node and derive topological features. Furthermore, each node carries rich semantic information when linked to the literature, and this information can be used to derive semantic features. Using both types of features, we train machine learning models to predict the probability of a connection between new node pairs. We apply our approach to link prediction to two distinct networks: a co-authorship network and a protein-protein interaction network. We demonstrate that the novel features we derive from both the network topology and the literature content help improve link prediction accuracy. We also analyze the factors involved in establishing new links and recurrent connections.
dc.description.embargo: 2016-11-24
dc.embargo.liftdate: 2016-11-24
dc.identifier.uri: http://digital.library.wisc.edu/1793/88673
dc.relation.replaces: https://dc.uwm.edu/etd/786
dc.subject: Citation Network
dc.subject: Conditional Random Fields
dc.subject: Graph Analysis
dc.subject: Machine Learning
dc.subject: Protein-protein Interaction
dc.subject: Social Network
dc.title: Complex Network Analysis for Scientific Collaboration Prediction and Biological Hypothesis Generation
dc.type: dissertation
thesis.degree.discipline: Engineering
thesis.degree.grantor: University of Wisconsin-Milwaukee
thesis.degree.name: Doctor of Philosophy

Files

Original bundle

Name: Zhang_uwm_0263D_10760.pdf
Size: 2.18 MB
Format: Adobe Portable Document Format
Description: Main File