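Abstract: Mining causality is essential for providing a diagnosis. This research aims to extract the causality that exists within multiple sentences or EDUs (elementary discourse units). The research emphasizes the use of causality verbs because they make explicit, in a certain way, the events that follow from a cause, e.g., "Aphids suck the sap from rice leaves. Then the leaves will shrink. Later, they will become yellow and dry." A verb can also serve as the causal-verb link between cause and effect within an EDU, e.g., "Aphids suck the sap from rice leaves, causing the leaves to shrink" ("causing" is equivalent to a causal-verb link in Thai). The research confronts two main problems: identifying the interesting causality events in documents, and identifying their boundaries. We then propose mining on verbs using two different machine learning techniques, Naive Bayes and the support vector machine. The resulting mining rules are used to identify and extract causality across multiple EDUs in text. Our multiple-EDU extraction achieves 0.88 precision with 0.75 recall using Naive Bayes, and 0.89 precision with 0.76 recall using the support vector machine.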
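As a rough illustration of the verb-based Naive Bayes step mentioned in this abstract, the sketch below trains a multinomial Naive Bayes classifier with add-one smoothing on the verbs of short EDU groups. It is not the authors' system: the toy training data, the labels `causal` and `non_causal`, and the function names `train_naive_bayes` and `predict` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count labels and per-label verb frequencies from (verbs, label) pairs."""
    label_counts = Counter()
    verb_counts = defaultdict(Counter)
    vocabulary = set()
    for verbs, label in samples:
        label_counts[label] += 1
        for verb in verbs:
            verb_counts[label][verb] += 1
            vocabulary.add(verb)
    return label_counts, verb_counts, vocabulary


def predict(model, verbs):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, verb_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(verb_counts[label].values()) + len(vocabulary)
        for verb in verbs:
            score += math.log((verb_counts[label][verb] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: verbs taken from groups of EDUs, with a hand-made label.
TRAINING = [
    (["suck", "shrink"], "causal"),
    (["infect", "wilt"], "causal"),
    (["destroy", "dry"], "causal"),
    (["grow", "harvest"], "non_causal"),
    (["plant", "water"], "non_causal"),
    (["sell", "harvest"], "non_causal"),
]

model = train_naive_bayes(TRAINING)
prediction = predict(model, ["suck", "dry"])
```

A real system would also need EDU segmentation and boundary detection, which this sketch leaves out.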
Abstract: A semi-structured document has more structured information than an ordinary document, and the relations among semi-structured documents can be fully utilized. In order to take advantage of the structure and link information in a semi-structured document for better mining, a structured link vector model (SLVM) is presented in this paper, where a vector represents a document, and the vector's elements are determined by terms, document structure and neighboring documents. Text mining based on SLVM is described in the procedure of K-means for briefness and clarity: calculating document similarity and calculating the cluster center. The clustering based on SLVM performs significantly better than that based on a conventional vector space model in the experiments, and its F value increases from 0.65-0.73 to 0.82-0.86.
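The abstract does not give the SLVM formulas, so the following is only a simplified sketch of the general idea: a document vector built from terms, weighted by the structural element they appear in, and enriched with the vectors of linked documents. The two K-means ingredients named above (document similarity and cluster center) are shown as `cosine` and `centroid`. The weights `ELEMENT_WEIGHTS` and `NEIGHBOR_WEIGHT`, the document layout, and all function names are assumptions for illustration.

```python
import math
from collections import Counter

# Hypothetical weights: terms in a title count more than terms in the body,
# and linked documents contribute at half strength.
ELEMENT_WEIGHTS = {"title": 3.0, "body": 1.0}
NEIGHBOR_WEIGHT = 0.5


def structured_vector(doc):
    """Term vector in which each term is weighted by its structural element."""
    vector = Counter()
    for element, text in doc["elements"].items():
        weight = ELEMENT_WEIGHTS.get(element, 1.0)
        for term in text.lower().split():
            vector[term] += weight
    return vector


def link_aware_vector(doc_id, docs):
    """Add a down-weighted copy of each linked neighbor's vector."""
    vector = Counter(structured_vector(docs[doc_id]))
    for neighbor_id in docs[doc_id]["links"]:
        for term, value in structured_vector(docs[neighbor_id]).items():
            vector[term] += NEIGHBOR_WEIGHT * value
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Cluster center: the element-wise mean of the member vectors."""
    center = Counter()
    for vector in vectors:
        for term, value in vector.items():
            center[term] += value / len(vectors)
    return center


# Hypothetical toy collection.
DOCS = {
    "a": {"elements": {"title": "text mining", "body": "clustering documents"}, "links": ["b"]},
    "b": {"elements": {"title": "document clustering", "body": "k-means text"}, "links": []},
    "c": {"elements": {"title": "coal mining", "body": "coal reserves"}, "links": []},
}

vectors = {doc_id: link_aware_vector(doc_id, DOCS) for doc_id in DOCS}
sim_ab = cosine(vectors["a"], vectors["b"])
sim_ac = cosine(vectors["a"], vectors["c"])
```

In this toy collection, document `a` ends up closer to `b` than to `c`, partly because the link from `a` to `b` pulls `b`'s terms into `a`'s vector.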
Abstract: Objectives: Polycystic ovary syndrome (PCOS) is a common endocrine disease in women of childbearing age. Although it is a leading cause of menstrual disorders, infertility, obesity, and other diseases, its molecular mechanism remains unclear. This study aimed to analyze the target genes, pathways, and potential drugs for PCOS through text mining. Methods: First, three different keywords ("polycystic ovary syndrome", "obesity/adiposis", and "anovulation") were uploaded to GenCLiP3 to obtain three different gene sets. We then chose the common genes among these gene sets. Second, we performed gene ontology and signal pathway enrichment analyses of these common genes, followed by protein-protein interaction (PPI) network analysis. Third, the most significant gene module clustered in the protein-protein network was selected to identify potential drugs for PCOS via gene-drug analysis. Results: A total of 4291 genes related to three different keywords were obtained through text mining, 72 common genes were filtered among the three gene sets, and 69 genes participated in PPI network construction, of which 23 genes were clustered in the gene modules. Finally, six of the 23 genes were targeted by 30 existing drugs. Conclusions: The discovery of the six genes (CYP19A1, ESR1, IGF1R, PGR, PTGS2, and VEGFA) and 30 targeted drugs, which are associated with ovarian steroidogenesis (P < 0.001), may be used in potential therapeutic strategies for PCOS.
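The "common genes" step in the Methods amounts to a set intersection across the per-keyword gene sets. The sketch below shows that step only. The gene names are placeholders, not GenCLiP3 output, and `common_genes` is a hypothetical helper name.

```python
def common_genes(gene_sets):
    """Return the genes that appear in every keyword's gene set."""
    sets = [set(genes) for genes in gene_sets.values()]
    return set.intersection(*sets) if sets else set()


# Placeholder gene sets keyed by search keyword (illustrative only).
GENE_SETS = {
    "polycystic ovary syndrome": {"GENE_A", "GENE_B", "GENE_C"},
    "obesity/adiposis": {"GENE_A", "GENE_C", "GENE_D"},
    "anovulation": {"GENE_A", "GENE_C", "GENE_E"},
}

shared = sorted(common_genes(GENE_SETS))
```

The enrichment, PPI network, and gene-drug analyses that follow in the abstract rely on external tools and are not sketched here.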
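Abstract: Text mining, also known as knowledge discovery from text, has emerged as a possible solution to the current information explosion. It refers to the process of extracting non-trivial and useful patterns from unstructured text. Among the general tasks of text mining, such as text clustering and summarization, text classification is a subtask of intelligent information processing: it learns a classifier from training texts in order to predict the class of unlabeled texts. Because of its simplicity and the objectivity of its performance evaluation, text classification is often used as a standard tool for determining the strengths and weaknesses of a text processing method, such as text representation or text feature selection. In this paper, text classification is carried out on Web documents collected from the XSSC website (http://www.xssc.ac.cn). The performance of the support vector machine (SVM) and the back-propagation neural network (BPNN) is compared on this task. Specifically, binary text classification and multi-class text classification are conducted on the XSSC documents. In addition, the classification results of the two methods are combined to improve classification accuracy. An experiment shows that BPNN can compete with SVM in binary text classification, whereas SVM performs better in multi-class text classification. Furthermore, the combined method improves classification in both the binary and the multi-class settings.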
Abstract: With the development of Web 2.0, more and more people choose to use the Internet to express their opinions. Together, these opinions form a new kind of text that contains a great deal of valuable emotional information, which is why processing these texts and analyzing their emotional information is significant. Sentiment analysis has three main tasks: sentiment extraction, sentiment classification, and sentiment application and summarization. In this paper, we introduce the steps of sentiment analysis in detail, based on the R software. Finally, we collect movie reviews from the Internet and use R to perform sentiment analysis in order to judge the emotional tendency of the text.
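The abstract's work is done in R and its exact procedure is not given here. Purely to illustrate what judging the emotional tendency of a review can look like, the sketch below uses a tiny lexicon-based scorer in Python. The word lists, the one-word negation rule, and the function names are assumptions for the example, not the paper's method.

```python
import re

# Hypothetical mini-lexicons; a real analysis would use a full sentiment dictionary.
POSITIVE = {"good", "great", "excellent", "moving", "enjoyable"}
NEGATIVE = {"bad", "boring", "terrible", "dull", "disappointing"}
NEGATIONS = {"not", "never", "no"}


def polarity_score(review):
    """Sum +1/-1 per sentiment word, flipping the sign after a negation word."""
    tokens = re.findall(r"[a-z']+", review.lower())
    score = 0
    for index, token in enumerate(tokens):
        value = (token in POSITIVE) - (token in NEGATIVE)
        if value and index > 0 and tokens[index - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def classify(review):
    """Map the polarity score to a coarse sentiment label."""
    score = polarity_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label = classify("The plot was not good and the ending was dull")
```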
Abstract: Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we perform a systematic introduction and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.
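To make the pattern-growth idea concrete, here is a minimal PrefixSpan-style sketch. It is simplified under a stated assumption: each sequence element is a single item rather than an itemset, so it does not cover the full algorithm described in the paper. The function name `prefixspan` and the toy database are illustrative.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return (pattern, support) pairs for all frequent sequential patterns."""
    results = []

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs: the suffix
        # of each sequence that remains after matching the current prefix.
        first_positions = defaultdict(dict)
        for seq_index, start in projected:
            sequence = sequences[seq_index]
            for position in range(start, len(sequence)):
                item = sequence[position]
                # Keep only the first occurrence per sequence so that the
                # projected suffix is as long as possible.
                if seq_index not in first_positions[item]:
                    first_positions[item][seq_index] = position
        for item in sorted(first_positions):
            occurrences = first_positions[item]
            if len(occurrences) >= min_support:
                pattern = prefix + [item]
                results.append((pattern, len(occurrences)))
                # Grow the pattern inside the projected database only.
                mine(pattern, [(i, p + 1) for i, p in occurrences.items()])

    mine([], [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy database of four sequences.
DATABASE = [list("abcd"), list("acd"), list("abd"), list("bcd")]
patterns = {tuple(p): support for p, support in prefixspan(DATABASE, 3)}
```

Each recursive call scans only the suffixes that follow the current prefix, which is what lets pattern growth avoid generating and testing candidate subsequences.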
Abstract: Geological Prospecting and Mining in Tibet. DONDUI NAMGYI. September 1, 1995 marked the 30th anniv...
Abstract: Huainan Coal Mining Bureau, an extra-large coal enterprise and a key state coal production base, is situated in the north-central part of Anhui Province. The area, well known as "the coal capital of East China", abounds in coal resources, and the proven coal reserve is estimated to be up to 70 billion tons, with a complete range of varieties and superior quality. By the year 2010, the annual production capacity will reach 30 million tons. The area offers an excellent investment environment and convenient communication and transportation.
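Abstract: Text linguistics and translation studies. The paper goes on to discuss the text-linguistic approach to translation studies, together with the scope, research focus, and research methods of text-based translation studies, that is, the text-linguistic approach to translation studies.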