Abstract

Background: The global prevalence of nonalcoholic fatty liver disease (NAFLD) is increasing. The pathogenesis of NAFLD is multifaceted, and the underlying mechanisms remain elusive. We conducted a data mining analysis to gain better insight into the disease and to identify the hub genes associated with the progression of NAFLD.

Methods: The dataset GSE49541, containing the expression profiles of 40 samples representing mild stages of NAFLD and 32 samples representing advanced stages of NAFLD, was acquired from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified using the R programming language. The Database for Annotation, Visualization and Integrated Discovery (DAVID) online tool was used to perform the enrichment analysis, and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was used to construct protein-protein interaction (PPI) networks. Subsequently, transcription factor networks and key modules were identified. The hub genes were validated by real-time quantitative PCR in a mouse model of high-fat diet (HFD)-induced NAFLD and in cultured HepG2 cells.

Results: Based on the GSE49541 dataset, 57 DEGs were selected. These DEGs were enriched in chemokine activity and in cellular component terms, including the extracellular region. PPI analysis indicated twelve transcription factors associated with the DEGs. Upregulated expression of five hub genes identified from the dataset (SOX9, CCL20, CXCL1, CD24, and CHST4) was also observed in the livers of mice with HFD-induced NAFLD and in HepG2 cells exposed to palmitic acid or advanced glycation end products.

Conclusion: The hub genes SOX9, CCL20, CXCL1, CD24, and CHST4 are involved in the aggravation of NAFLD. Our results offer new insights into the underlying mechanism of NAFLD progression.