hadoop - What should the flume.conf parameters be to save tweets to a single FlumeData file per hour?
We are saving tweets in an hourly directory layout, /user/flume/2016/06/28/13/FlumeData... , but more than 100 FlumeData files are created every hour. I changed twitteragent.sinks.hdfs.hdfs.rollSize to 52428800 (50 MB), but the same thing happened again. After that I tried changing the rollCount parameter, and that did not work either. How can I set the parameters so that there is one FlumeData file per hour?
twitteragent.sinks.hdfs.channel = memchannel
twitteragent.sinks.hdfs.type = hdfs
twitteragent.sinks.hdfs.hdfs.path = hdfs://hpc01:8020/user/flume/tweets/%Y/%m/%d/%H
twitteragent.sinks.hdfs.hdfs.fileType = DataStream
twitteragent.sinks.hdfs.hdfs.writeFormat = Text
twitteragent.sinks.hdfs.hdfs.batchSize = 1
twitteragent.sinks.hdfs.hdfs.rollSize = 0
twitteragent.sinks.hdfs.hdfs.rollCount = 10
twitteragent.sinks.hdfs.hdfs.rollintinterval = 0
twitteragent.channels.memchannel.type = memory
twitteragent.channels.memchannel.capacity = 10000
twitteragent.channels.memchannel.transactionCapacity = 1000
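One possible fix, offered as a sketch rather than a verified answer: the HDFS sink closes (rolls) a file as soon as any one of hdfs.rollInterval (default 30 seconds), hdfs.rollSize (default 1024 bytes) or hdfs.rollCount (default 10 events) is reached, and setting a trigger to 0 disables it. The key spelled rollintinterval in the listing above is not a property the sink reads, so the 30-second default would still apply, which by itself gives about 120 files per hour. The rollcount value of 10 would also close a file after every 10 tweets. Assuming the agent, sink and channel names from the question, and writing the property names in Flume's case-sensitive camelCase, the roll settings could look like this:

```properties
# Roll on time only: close the current file 3600 seconds after it was opened.
twitteragent.sinks.hdfs.hdfs.rollInterval = 3600
# 0 disables size-based and event-count-based rolling.
twitteragent.sinks.hdfs.hdfs.rollSize = 0
twitteragent.sinks.hdfs.hdfs.rollCount = 0
```

Because hdfs.path already ends in an hour directory, events for a new hour open a new file in the new directory, so each hourly directory should end up with a single file as long as the agent is not restarted mid-hour. This has not been tested against the cluster in the question.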