首页
学习
活动
专区
工具
TVP
发布
精选内容/技术社群/优惠产品,尽在小程序
立即前往

技术 | 数据仓库分层存储技术揭秘

据IDC发布的《数据时代2025》报告显示,全球每年产生的数据将从2018年的33ZB增长到2025年的175ZB,平均每天约产生491EB数据。随着数据量的不断增长,数据存储成本成为企业IT预算的重要组成部分。例如1PB数据存储一年,全部放在高性能存储介质和全部放在低成本存储介质两者成本差距在一个量级以上。由于关键业务需高性能访问,因此不能简单的把所有数据存放在低速设备,企业需根据数据的访问频度,使用不同种类的存储介质获得最小化成本和最大化效率。因此,把数据存储在不同层级,并能够自动在层级间迁移数据的分层存储技术成为企业海量数据存储的首选。

02
  • 您找到你想要的搜索结果了吗?
    是的
    没有找到

    Import Kafka data into OSS using E-MapReduce service

    Overview Kafka is a frequently-used message queue in open-source communities. Although Kafka (Confluent) officially provides plug-ins to import data directly from Kafka to HDFS's connector, Alibaba Cloud provides no official support for the file storage system OSS. This article will give a simple example to implement data writes from Kafka to Alibaba Cloud OSS. Because Alibaba Cloud E-MapReduce service integrates a large number of open-source components and docking tools for Alibaba Cloud, in this article, the example is directly run in the E-MapReduce cluster. This example uses the open-source Flume tool as a transit to connect Kafka and OSS. Flume open-source components may also appear on the E-MapReduce platform in the future. Scenario example Next we will name a simple example. If you already have an online Kafka cluster, you can directly jump to Step 4. 1. In the Kafka Home directory, start the Kafka service process. Configure the Zookeeper address in the configuration file to the service address emr-header-1:2181 bin/kafka-server-start.sh config/server.properties 2. Create a Kafka topic with a name of test bin/kafka-topics.sh --create --zookeeper emr-header-1:2181 \ --replication-factor 1 --partitions 1 --topic test 3. Write data to Kafka test topic and the data content is the performance monitoring data of the local machine vmstat 1 | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 4. Configure and start the Flume service in the Flume Home directory Create a new configuration file: conf/kafka-example.conf. In specific, specify the source as the corresponding topic for Kafka, and use sink as the HDFS Sinker. Specify the path as the OSS path. Because the E-MapReduce service implements an efficient OSS FileSystem (compatible with Hadoop FileSystem) for us, the OSS path can be specified directly, and the HDFS Sinker data will be automatically written to OSS. # Name the components on this agent a1.sources = source1 a1.sinks = oss1 a1.channels = c1 # Describe/configure

    03
    领券