java - Fail to read from Bigtable in Dataflow
I am using Dataflow in my work to write data to Bigtable.
Currently, I have a task to read rows back from Bigtable.
However, whenever I try to read rows from Bigtable using bigtable-hbase-dataflow, it fails and complains as follows:
Error: (3218070e4dd208d3): java.lang.IllegalArgumentException: b <= a
    at org.apache.hadoop.hbase.util.Bytes.iterateOnSplits(Bytes.java:1720)
    at org.apache.hadoop.hbase.util.Bytes.split(Bytes.java:1683)
    at org.apache.hadoop.hbase.util.Bytes.split(Bytes.java:1664)
    at com.google.cloud.bigtable.dataflow.CloudBigtableIO$AbstractSource.split(CloudBigtableIO.java:512)
    at com.google.cloud.bigtable.dataflow.CloudBigtableIO$AbstractSource.getSplits(CloudBigtableIO.java:358)
    at com.google.cloud.bigtable.dataflow.CloudBigtableIO$Source.splitIntoBundles(CloudBigtableIO.java:593)
    at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:413)
    at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:171)
    at com.google.cloud.dataflow.sdk.runners.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:149)
    at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.execute(SourceOperationExecutor.java:58)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:288)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:221)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:173)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:193)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:173)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:160)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
I am using 'com.google.cloud.dataflow:google-cloud-dataflow-java-sdk-all:1.6.0' and 'com.google.cloud.bigtable:bigtable-hbase-dataflow:0.9.0' right now.
Here's the code:
CloudBigtableScanConfiguration config = new CloudBigtableScanConfiguration.Builder()
    .withProjectId("project-id")
    .withInstanceId("instance-id")
    .withTableId("table")
    .build();

pipeline.apply(Read.<Result>from(CloudBigtableIO.read(config)))
        .apply(ParDo.of(new Test()));
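(For context, the pipeline object above is created in the usual way for the 1.6.0 SDK. The sketch below is just that boilerplate with placeholder project and staging values, in case the setup matters.)

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner;

// Minimal sketch of the pipeline setup assumed by the snippet above.
// "project-id" and the staging bucket are placeholders, not real values.
DataflowPipelineOptions options =
    PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
options.setProject("project-id");
options.setStagingLocation("gs://my-bucket/staging");
options.setRunner(BlockingDataflowPipelineRunner.class);
Pipeline pipeline = Pipeline.create(options);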
FYI, I just read rows from Bigtable and count them using an Aggregator in the Test DoFn.
static class Test extends DoFn<Result, Result> {
  private static final long serialVersionUID = 0L;

  private final Aggregator<Long, Long> rowCount =
      createAggregator("row_count", new Sum.SumLongFn());

  @Override
  public void processElement(ProcessContext c) {
    rowCount.addValue(1L);
    c.output(c.element());
  }
}
I followed the tutorial in the Dataflow documentation, but it still fails. Can anyone help me out?
The root cause was a dependency issue:
Previously, our build file omitted this dependency:
compile 'io.netty:netty-tcnative-boringssl-static:1.1.33.Fork22'
Today, I added the dependency and it resolved the issue. I double-checked that the problem arises whenever I don't have it in the build file.
From https://github.com/googlecloudplatform/cloud-bigtable-client/issues/912#issuecomment-249999380.
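For anyone checking their own build, here is a minimal sketch of the resulting dependencies block (the first two coordinates come from the question; assuming an otherwise standard Gradle Java build):

dependencies {
    compile 'com.google.cloud.dataflow:google-cloud-dataflow-java-sdk-all:1.6.0'
    compile 'com.google.cloud.bigtable:bigtable-hbase-dataflow:0.9.0'
    // The previously missing artifact. Presumably it supplies the native
    // SSL/ALPN support that the Bigtable client's gRPC channel needs, so
    // without it the read source fails while computing splits.
    compile 'io.netty:netty-tcnative-boringssl-static:1.1.33.Fork22'
}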