Original poster | Posted 2013-8-10 12:06:38
Knowledge bases dedicated to the interpretation of OMICs data for biomarker discovery
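The mission of a knowledge base is to collect and systematize biomedical information by manually extracting it from original publications, the so-called curation process. The curation process organizes knowledge by mapping the extracted information onto entities. Such knowledge bases offer several features for the analysis of OMICs data: they allow OMICs data to be overlaid on known pathways, support the identification of the key pathways that are altered, and provide network analysis algorithms for identifying the key molecules behind a relevant gene signature. Over the past few years, several public and commercial knowledge bases have been introduced that provide an integrated environment, consisting of an annotated knowledge base and analysis tools, to facilitate comprehensive functional analysis.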
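Overlaying a gene signature on known pathways is often done with an over-representation test. The following is a minimal sketch under stated assumptions, not the algorithm of any knowledge base named in this post: the function names `hypergeom_tail` and `rank_pathways`, the pathway names, the gene identifiers, and the background size are all invented for illustration. It ranks pathways by a one-sided hypergeometric p-value.

```python
from math import comb


def hypergeom_tail(hits, background, pathway_size, signature_size):
    """P(X >= hits) when signature_size genes are drawn without replacement
    from a background that contains pathway_size pathway genes."""
    upper = min(pathway_size, signature_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background - pathway_size, signature_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background, signature_size)


def rank_pathways(signature, pathways, background):
    """Return (pathway, overlapping genes, p-value) rows, smallest p-value first.

    Assumes every signature and pathway gene belongs to the background set.
    """
    rows = []
    for name, members in pathways.items():
        overlap = signature & members
        p = hypergeom_tail(len(overlap), background, len(members), len(signature))
        rows.append((name, sorted(overlap), p))
    return sorted(rows, key=lambda row: row[2])


# Invented placeholder data: not taken from any real knowledge base.
toy_pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g5", "g6", "g7", "g8"},
}
toy_signature = {"g1", "g2", "g3", "g9"}

for name, overlap, p in rank_pathways(toy_signature, toy_pathways, background=100):
    print(f"{name}: overlap={overlap}, p={p:.3g}")
```

In practice the p-values would also need correction for multiple testing across all pathways examined; that step is left out here to keep the sketch short.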
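Although all of these databases contain manually curated knowledge, they differ in coverage and in the granularity of their information. These differences reflect differences in information retrieval methodology, variability in the resources used for knowledge extraction, and differences in how annotators interpret experimental results. Shmelkov et al. recently carried out a comparative analysis of the quality and completeness of human regulatory pathways in 10 public and commercial pathway knowledge bases and, surprisingly, found that the knowledge content of these databases overlaps very little [63]. The authors reported that the only exception was the MetaCore pathway database, whose content was validated by experimental results in 84% of cases, compared with a low overlap of 24% for the KEGG database.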
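To make the two kinds of overlap mentioned above concrete, here is a small sketch. It is not the procedure used by Shmelkov et al.; it assumes, purely for illustration, that each knowledge base can be reduced to a set of (regulator, target) pairs. The helper names `jaccard` and `validated_fraction` and all data are hypothetical.

```python
def jaccard(edges_a, edges_b):
    """Share of interactions common to two knowledge bases (0.0 to 1.0)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


def validated_fraction(kb_edges, experimental_edges):
    """Share of a knowledge base's interactions confirmed by experiment."""
    if not kb_edges:
        return 0.0
    return len(kb_edges & experimental_edges) / len(kb_edges)


# Invented placeholder interactions: (regulator, target) pairs.
kb_a = {("tf1", "t1"), ("tf1", "t2"), ("tf1", "t3"), ("tf1", "t4")}
kb_b = {("tf1", "t1"), ("tf1", "t5")}
experiment = {("tf1", "t2"), ("tf1", "t3"), ("tf1", "t4")}

print("overlap between kb_a and kb_b:", jaccard(kb_a, kb_b))
print("kb_a validated by experiment:", validated_fraction(kb_a, experiment))
```

The first measure compares two knowledge bases with each other; the second compares one knowledge base against experimental results, which is the sense in which the 84% and 24% figures are reported.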
Table 3. Knowledge bases to analyze OMICs data leading to biomarker discovery
Columns: Knowledge base | Features | Applications | Refs

Knowledge base: Metacore
Features:
- Metacore is an integrated commercial knowledge base from Thomson Reuters (previously GeneGo) that supports functional analysis (pathways, networks and maps) of a range of OMICs data, including microarray and sequence-based gene expression, SNPs and CGH (comparative genomic hybridization) arrays, proteomics and metabolomics
- Ranking of the affected pathways and networks from the experimental data, based on proprietary algorithms and on common functional interpretation of gene expression, e.g. using gene ontology (GO)
- Filters based on disease, tissue, subcellular localization and functional processes to capture a specific network
- The toxicology application of Metacore is specifically designed to discover safety, efficacy and toxicity biomarkers for a chemical compound
- See: http://www.genego.com/metacore.php
Applications:
- Brentnall et al., in collaboration with the Institute of Systems Biology, completed a quantitative proteomic analysis to investigate differentially expressed proteins associated with ulcerative colitis (UC) neoplastic progression. Functional analyses of the differentially expressed proteins with Metacore software identified Sp1 and c-MYC as biomarkers of early and late stage UC tumorigenesis
- The same collaborative group carried out ICAT-based quantitative proteomics research to analyze protein expression in chronic pancreatitis in comparison with normal pancreas. Metacore-assisted pathway analysis revealed c-MYC as a prominent regulator in the networks of differentially expressed proteins common to pancreatic cancer and chronic pancreatitis
- Another collaborative group with Bayer Schering Pharma discovered the functional link between the KRAS mutation and Erlotinib resistance in non-small cell lung carcinoma (NSCLC). The functional analysis of the RNA expression data with Metacore indicated a possible correlation of differential expression of cell adhesion proteins with NSCLC
Refs: [49], [50], [51]

Knowledge base: IPA (software developed by Ingenuity Systems)
Features:
- IPA is a manually curated commercial knowledge base from Ingenuity Systems
- Its biomarker filter is specialized to prioritize molecular biomarkers based on species-specific connection to diseases, detection in body fluid, and expression in a specific cell type, cell line or clinical sample; it also supports stratification biomarker discovery based on disease state or drug response
- The tool can also produce functional annotation of the biomarker, including pathway association
- See: http://www.ingenuity.com/science/knowledge_base.html
Applications:
- Using Ingenuity pathway analysis, Merck & Co. predicted and then experimentally validated that phospho-PRAS40 (Thr246) positively correlates with PI3K pathway activation and AKT inhibitor sensitivity in a PTEN-deficient mouse prostate tumor model and in triple-negative breast tumor tissues
- Bristol-Myers Squibb analyzed the gene expression signature of responders and non-responders to neoadjuvant ixabepilone therapy in breast cancer. Functional analysis of the data with IPA indicated that significant deregulation of certain proliferation and cell cycle control genes can potentially predict treatment sensitivity
- Cleveland Clinic reported a functional analysis with IPA of the genes carrying non-synonymous SNPs that may be associated with the severity of sunitinib-induced toxicity in metastatic clear cell renal cell carcinoma. According to the functional analysis, those genes clustered around biological processes such as the interferon-γ, TNF β, TGF β 1 and amino acid metabolism molecular pathways
Refs: [52], [53], [54]

Knowledge base: Pathway Studio
Features:
- Pathway Studio is commercial software from Elsevier for pathway analysis as well as analysis of high-throughput OMICs data. Algorithms for the analysis of differential expression data, such as Gene Set Enrichment Analysis (GSEA) or the network analysis algorithm (NEA), allow detection of weak but consistent expression changes across the pathway genes
- It is based on the proprietary databases ResNet, DiseaseFx, ChemEffect, and the Mammalian and Plant databases, which contain relationships between biological molecules, chemicals, diseases and adverse events
- The databases are built using proprietary Natural Language Processing (NLP)-based relationship extraction from scientific literature
- The software suite also provides state-of-the-art network algorithms to pinpoint important nodes from the network perspective. The researcher can also visualize the weight of each relationship in the pathways based on the number of supporting literature references
- See: http://www.pathwaystudio.com/
Applications:
- A group from Harvard Medical School published the functional connection of 117 highly differentially expressed genes to endometrial cancer. Pathway Studio-assisted analysis of the data predicted that many of these genes are correlated with angiogenesis, cell proliferation and chromosomal instability. Furthermore, they reported ten key differentially regulated genes to be associated with tumor progression
- Xiao et al. published a functional analysis of the EGFR-regulated phosphoproteome in nasopharyngeal carcinoma (NPC) to shed light on EGFR downstream signaling. They first identified 33 unique phosphoproteins by two-dimensional difference gel electrophoresis (2D-DIGE) and mass spectrometry. Based on the proteomic data, the group built the EGFR signaling network in NPC using Pathway Studio and also validated GSTP1 as one of the key EGFR-regulated proteins involved in chemoresistance in NPC cells
Refs: [55], [56]

Knowledge base: Compendia Bioscience (Oncomine)
Features:
- DNA copy number browser: identifying focal amplification across multiple cancer clinical data sets to identify any associated pattern
- Gene expression browser: browsing differential expression of genes across multiple cancer types, covering multiple data sets
- Mutation browser: discovering the cancer association of certain mutations by looking at the frequency of mutation of certain genes
- OncoScore: uses gene expression data to stratify the patient population by disease prognosis and response to a therapeutic intervention. At the moment the service is limited to breast and colon cancer
- See: http://www.compendiabio.com/
Applications:
- Using Oncomine, a group from the University of Michigan predicted that decreased protein expression of Raf kinase inhibitor protein (RKIP) is a prognostic biomarker in prostate cancer
- Another group from the same university predicted that high expression of EZH2 and ECAD was statistically significantly associated with prostate cancer recurrence after radical prostatectomy
Refs: [57], [58]

Knowledge base: NextBio
Features:
- NextBio Clinical:
  - Semantic-based integration of proprietary OMICs data with public knowledge to gain better insight, leading to the discovery of drug targets and biomarkers
  - Discovery and validation of stratification biomarkers for a therapy by accessing genomic data from cell lines, stem cells, animal models and retrospective analysis of clinical trials
- NextBio Research:
  - Identifying crucial pathways leading to a disease phenotype, supported by cross-study comparison and multiple data points
  - Identification of disease biomarkers and analysis of pharmacokinetic profiles or toxicity indications
- It uses proprietary algorithms to rank the search outcomes based on the statistical significance of the correlation supported by bioset data points
- See: http://www.nextbio.com/b/nextbioCorp.nb
Applications:
- Using the NextBio platform, Walia et al. reported that loss of the breast epithelial marker hCLCA2 (chloride channel accessory protein) promotes a higher risk of metastasis
Refs: [59]

Knowledge base: Selventa
Features:
- Discovery of predictive response biomarkers by reverse engineering disease mechanisms a priori from molecular patient data (OMICs data)
- It utilizes an extensive, manually curated knowledge base containing literature-derived triples encoded in BEL
- It identifies disease- and tissue-specific biomarker content that can match targeted therapies to subpopulations of patients
- The Reverse Causal Reasoning (RCR) algorithm is used for identification of master regulators
Applications:
- Very recently, Selventa introduced its openBEL framework for biomarker discovery based on mechanistic causal reasoning and demonstrated its application in stratifying responders to the ulcerative colitis drug infliximab from non-responders, based on identification of IL6 as the biomarker for alternative disease mechanisms in non-responders
Refs: [60]

Knowledge base: tranSMART
Features:
- A knowledge management platform enabling integration of OMICs data with published literature, clinical trial outcomes and established knowledge from Metacore, Ingenuity IPA, and the US National Library of Medicine (NLM)
- The applications of this platform include generating novel hypotheses and validating them, associating certain pathways, genes and SNPs with disease, and biomarker discovery
- See: http://www.transmartproject.org/
Applications:
- Analysis of transcriptomic data from melanoma patients using the k-means clustering facility in tranSMART showed that the expression levels of cyclin D1 increase from benign to malignant, whereas in metastatic melanomas the expression level decreases, clearly delineating multiple subgroups of samples in the presumably homogeneous metastatic melanoma cohort
Refs: [61]

Knowledge base: KegArray
Features:
- A microarray gene expression and metabolomics data analysis tool from KEGG
- Able to map OMICs data to KEGG Pathways, Brite and genome maps
- See: http://www.kegg.jp/kegg/download/kegtools.html
Applications:
- KegArray was used to investigate metabolic pathways associated with the marker metabolites that were detected by 2D gas chromatography mass spectrometry in tissues from 31 patients with colorectal cancer. The results led to the identification of chemically diverse marker metabolites, and metabolic pathway mapping suggested deregulation of various biochemical processes
Refs: [6

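Issues of biomarker classification and biomarker knowledge representation have hampered literature searches for information about biomarkers. In fact, assessing the quality of a translational biomarker requires a wide range of information, including sensitivity, specificity, mechanism of action, toxicity, and level of clinical performance, which underlines the need to standardize biomarker vocabulary and classification. Recently, a typical process has been proposed for establishing evidentiary standards for biomarkers and diagnostics, to ensure that biomarkers are qualified according to the type of scientific evidence [64]. Similarly, the Pistoia Alliance, initially founded by informatics experts from several pharmaceutical companies, has launched a program focused on developing oncology and data standards for integrating biomarker assay data and handling different endpoints [Pistoia Alliance: http://www.pistoiaalliance.org/]. Although still at a preliminary stage, such developments could form the basis for future biomarker standardization efforts. Next-generation knowledge bases should therefore address the issues above by introducing effective information retrieval and extraction tools as well as biomarker data standards. Taken together, the existing knowledge bases have both advantages and disadvantages, which are summarized in Table 4.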