Liu Xiaoli (刘小利)*, Xu Pandeng**, Liu Mingliang***, Zhu Guobin****. [J]. High Technology Letters (高技术通讯, English edition), 2015, 21(2): 231~238

Design and development of real-time query platform for big data based on Hadoop

DOI: 10.3772/j.issn.1006-6748.2015.02.017
Keywords: big data, massive data storage, real-time query, Hadoop, distributed computing
Authors: Liu Xiaoli (刘小利)*, Xu Pandeng**, Liu Mingliang***, Zhu Guobin****
Abstract:
This paper designs and develops a framework on a distributed computing platform for massive multi-source spatial data using a column-oriented database (HBase). The platform consists of four tiers: an ETL (extraction, transformation, loading) tier, a data processing tier, a data storage tier and a data display tier. Together they provide long-term storage, real-time analysis and real-time query of massive data. Finally, a cluster holding a real dataset is simulated, made up of 39 nodes (2 master nodes and 37 data nodes). On this cluster, function tests of the data import module and the real-time query module are performed, along with performance tests of HDFS I/O, the MapReduce cluster, batch loading and real-time query of massive data. The test results indicate that the platform achieves high performance in terms of response time and linear scalability.
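
The four-tier flow described in the abstract can be illustrated with a small, purely in-memory sketch. It is not the authors' implementation and does not call HBase or Hadoop. The class ToyColumnStore, the input format "source,timestamp,lon,lat", the row-key layout "source#timestamp" and the column names "geo:lon" and "geo:lat" are all assumptions made for illustration. The one property borrowed from HBase is that rows are kept sorted by row key, so that a query for one source becomes a short range scan rather than a full pass over the data.

```python
from bisect import bisect_left, insort


class ToyColumnStore:
    """In-memory stand-in for a column-oriented store with sorted row keys."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_key, columns):
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def scan_prefix(self, prefix):
        # Sorted keys let the scan jump to the first candidate key
        # and stop at the first key that no longer matches the prefix.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, self._rows[key]
            i += 1


def extract(lines):
    """ETL tier: parse 'source,timestamp,lon,lat' lines (assumed format)."""
    for line in lines:
        source, ts, lon, lat = line.strip().split(",")
        yield {"source": source, "ts": int(ts),
               "lon": float(lon), "lat": float(lat)}


def to_row(record):
    """Processing tier: build a row key and column values (assumed layout)."""
    # Zero-padding the timestamp makes string order match time order.
    key = f"{record['source']}#{record['ts']:010d}"
    return key, {"geo:lon": record["lon"], "geo:lat": record["lat"]}


def load(store, lines):
    """Storage tier: write every processed record; return the record count."""
    count = 0
    for record in extract(lines):
        store.put(*to_row(record))
        count += 1
    return count


def query(store, source):
    """Display tier: (timestamp, lon, lat) tuples for one source, by time."""
    return [(int(key.split("#")[1]), cols["geo:lon"], cols["geo:lat"])
            for key, cols in store.scan_prefix(source + "#")]
```

Calling load(store, lines) on a list of text lines and then query(store, "gps") returns the points recorded for the hypothetical source "gps" in timestamp order. In a real deployment the storage tier would be a distributed store and the loading step a batch job, which this sketch does not attempt to model.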