Article abstract
Liu Xiaoli (刘小利)*, Xu Pandeng**, Liu Mingliang***, Zhu Guobin****. [J]. High Technology Letters (English edition), 2015, 21(2): 231-238
Design and development of a real-time query platform for big data based on Hadoop
  
DOI:10.3772/j.issn.1006-6748.2015.02.017
Chinese keywords:
English keywords: big data, massive data storage, real-time query, Hadoop, distributed computing
Funding:
Author Name    Affiliation
Liu Xiaoli (刘小利)*  
Xu Pandeng**  
Liu Mingliang***  
Zhu Guobin****  
Chinese abstract:
      
English abstract:
      This paper designs and develops a framework on a distributed computing platform for massive multi-source spatial data, using a column-oriented database (HBase). The platform consists of four tiers: an ETL (extraction, transformation, loading) tier, a data processing tier, a data storage tier, and a data display tier. Together they provide long-term storage, real-time analysis, and real-time query of massive data. Finally, a cluster holding a real dataset is simulated; it is made up of 39 nodes, comprising 2 master nodes and 37 data nodes. On this cluster, functional tests are run on the data import module and the real-time query module, and performance tests are run on HDFS I/O, the MapReduce cluster, batch loading, and real-time query of massive data. The test results indicate that the platform achieves high performance in terms of response time and linear scalability.
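
The four-tier flow named in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function and class names (`etl_tier`, `processing_tier`, `StorageTier`, `display_tier`, `run_pipeline`), the comma-separated input format, and the `d:` column prefix are all hypothetical, and an in-memory dictionary stands in for the HBase table so that the example runs without a cluster.

```python
from collections import defaultdict


def etl_tier(raw_lines):
    """ETL tier (hypothetical): parse 'source, key, value' lines into records."""
    records = []
    for line in raw_lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip malformed rows instead of failing the whole load
        source, key, value = parts
        records.append({"source": source, "key": key, "value": float(value)})
    return records


def processing_tier(records):
    """Processing tier (hypothetical): group values by row key and column."""
    rows = defaultdict(dict)
    for rec in records:
        # One column per data source, under an assumed 'd' column family.
        rows[rec["key"]]["d:" + rec["source"]] = rec["value"]
    return rows


class StorageTier:
    """Storage tier: in-memory stand-in for a column-oriented table."""

    def __init__(self):
        self._table = {}

    def put(self, row_key, columns):
        self._table.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        return dict(self._table.get(row_key, {}))


def display_tier(store, row_key):
    """Display tier (hypothetical): format one row lookup for presentation."""
    cols = store.get(row_key)
    return ", ".join(f"{name}={value}" for name, value in sorted(cols.items()))


def run_pipeline(raw_lines):
    """Chain ETL, processing, and storage; return the populated store."""
    store = StorageTier()
    for row_key, columns in processing_tier(etl_tier(raw_lines)).items():
        store.put(row_key, columns)
    return store
```

Calling `run_pipeline` on a few input lines and then `display_tier(store, row_key)` exercises all four tiers in order. In a real deployment the storage stand-in would be replaced by an HBase table, and the processing step by jobs on the cluster.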