| Liu Xiaoli (刘小利)*, Xu Pandeng**, Liu Mingliang***, Zhu Guobin****. [J]. High Technology Letters (English edition), 2015, 21(2): 231-238 |
|
| Design and development of real-time query platform for big data based on Hadoop |
| |
| DOI:10.3772/j.issn.1006-6748.2015.02.017 |
| Chinese keywords: |
| English keywords: big data, massive data storage, real-time query, Hadoop, distributed computing |
| Funding: |
| Author Name | Affiliation |
| Liu Xiaoli (刘小利)* | |
| Xu Pandeng** | |
| Liu Mingliang*** | |
| Zhu Guobin**** | |
|
| Chinese abstract: |
| |
| English abstract: |
| This paper designs and develops a framework on a distributed computing platform for massive multi-source spatial data using a column-oriented database (HBase). The platform consists of four tiers: an ETL (extraction, transformation, loading) tier, a data processing tier, a data storage tier and a data display tier, which together provide long-term storage, real-time analysis and real-time query of massive data. Finally, a cluster with a real dataset is simulated, made up of 39 nodes (2 master nodes and 37 data nodes). Function tests are performed on the data importing module and the real-time query module, and performance tests are performed on HDFS I/O, the MapReduce cluster, batch loading and real-time query of massive data. The test results indicate that the platform achieves high performance in terms of response time and linear scalability. |
|
|
|
|