
Flume: filtering by JSON field

Date: 2023-10-18 15:16:54 Views: 80

Flume: extracting MySQL and Oracle data and pushing it to Kafka as JSON

A custom Flume extension that extracts data from MySQL and Oracle databases and pushes it to Kafka in JSON format. Demo:

    sql_json.sources.sql_source.type = com.hbn.rdb.source.SQLSource
    sql_json.sources.sql_source.connectionurl = jdbc:oracle:thin:@IP:PORT/orcl
    sql_json.sources.sql_source.driverclass = oracle.jdbc.driver.OracleDriver
    sql_json.sources.sql_source.filename = sqlSource.status
    sql_json.sources.sql_source.customquery = SELECT INTID,ID_NO FROM TEST.TEST
    sql_json.sources.sql_source.begin = 0
    sql_json.sources.sql_source.autoincrementfield = INTID

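Flume can filter events by JSON field using an interceptor. The built-in regex filtering interceptor treats the event body as text, matches it against a regular expression, and keeps or drops the whole event. It does not parse the JSON or remove individual fields.

1. First, define an interceptor in the Flume configuration file.

```
# Define the Interceptor
agent.sources.source_name.interceptors = interceptor_name
agent.sources.source_name.interceptors.interceptor_name.type = regex_filter
# Do not wrap the regex in quotes: in a properties file they become part of the pattern
agent.sources.source_name.interceptors.interceptor_name.regex = regex_expression
# false keeps events that match the regex; true drops them
agent.sources.source_name.interceptors.interceptor_name.excludeEvents = false
```

2. Use a regular expression in the interceptor to match the JSON field. Take the following event body as an example.

```
{
  "field1": "value1",
  "field2": "value2",
  "field3": "value3"
}
```

To keep only the events whose JSON contains the field2 field, use the following regular expression.

```
agent.sources.source_name.interceptors.interceptor_name.regex = .*"field2":.*
```

3. Finally, make sure the interceptor is attached to the Flume source. This is the same line that opens the definition in step 1.

```
agent.sources.source_name.interceptors = interceptor_name
```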
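Before deploying the configuration, it can help to check the regular expression against sample event bodies offline. The following is a minimal sketch in Python, not Flume code: `keep_event` is a hypothetical helper that imitates the keep/drop decision, assuming the interceptor searches the body text for the pattern (as `re.search` does) and that `excludeEvents` inverts the result. Python's `re` stands in for Java's regex engine here; for a simple pattern like this one the two should behave alike.

```python
import re


def keep_event(body: bytes, regex: str, exclude_events: bool = False) -> bool:
    """Hypothetical stand-in for the regex filter's keep/drop decision."""
    # The body is matched as plain text; the JSON is never parsed.
    matched = re.search(regex, body.decode("utf-8")) is not None
    # exclude_events=False keeps matching events; True drops them instead.
    return not matched if exclude_events else matched


# Same pattern as in the configuration above (no surrounding quotes).
FIELD2_REGEX = r'.*"field2":.*'

# Sample event bodies; the second one has no field2.
events = [
    b'{ "field1": "value1", "field2": "value2", "field3": "value3" }',
    b'{ "field1": "value1", "field3": "value3" }',
]

kept = [e for e in events if keep_event(e, FIELD2_REGEX)]
for e in kept:
    print(e.decode("utf-8"))
```

Only the first sample body is printed, because it is the only one containing `"field2":`. Since the match is purely textual, filtering on a field's value or on nested fields would call for a stricter pattern or a custom interceptor that parses the JSON.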
