hadoop2 - Process multiple folders in Apache Spark


I have 100 folders. Each folder contains 5 files. I have an executable that processes 1 folder. The executable is a black box and hence cannot be modified. I want to process the 100 folders in parallel using Apache Spark, and I should be able to spawn one map task per folder. Can anyone give me an idea? I have come across similar questions for Hadoop, where the answer was to use CombineFileInputFormat and PathFilter. However, as I said, I want to use Apache Spark. Any ideas?
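One possible approach, sketched here in PySpark, is to parallelize the list of folder paths rather than the file contents, and have each task invoke the executable through `subprocess`. This is only a minimal sketch under several assumptions: the executable path `/opt/tools/process_folder` is hypothetical, the executable is assumed to take the folder path as its single argument, and both the executable and the folders are assumed to be reachable from every worker (for example on a shared filesystem). Setting `numSlices` to the number of folders is intended to give each folder its own partition and therefore its own task.

```python
import subprocess
import sys

# Hypothetical path to the black-box executable (assumption, not from the
# question); it must be installed at this path on every worker node.
EXECUTABLE = "/opt/tools/process_folder"


def process_folder(folder, executable=EXECUTABLE):
    """Run the executable on one folder and return (folder, exit code)."""
    # Assumes the executable accepts the folder path as its only argument.
    result = subprocess.run([executable, folder], capture_output=True, text=True)
    return folder, result.returncode


def process_all(sc, folders):
    """Run process_folder on every folder, aiming for one Spark task per folder."""
    # One slice per folder, so each folder should land in its own partition.
    rdd = sc.parallelize(folders, numSlices=len(folders))
    return rdd.map(process_folder).collect()


def main(folders):
    if not folders:
        print("usage: spark-submit process_folders.py FOLDER [FOLDER ...]")
        return
    # Imported here so the helper functions above can be used without Spark.
    from pyspark import SparkContext

    sc = SparkContext(appName="ProcessFolders")
    try:
        for folder, code in process_all(sc, folders):
            print(folder, "exit code", code)
    finally:
        sc.stop()


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as `process_folders.py`, this could be launched with `spark-submit process_folders.py /data/folder1 /data/folder2 ...`. Because only the folder paths travel through Spark, the executable reads the files itself, which keeps it a black box. If the folders live in HDFS instead of a shared filesystem, each task would first have to copy its folder to local disk, which this sketch does not cover.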

