hadoop2 - Process multiple folders in Apache Spark
I have 100 folders, and each folder contains 5 files. I have an executable that processes one folder at a time; the executable is a black box, so it cannot be modified. I want to process the 100 folders in parallel using Apache Spark, ideally spawning one map task per folder. Can anyone give me an idea of how to do this? I have come across similar questions for Hadoop, where the answer was to use CombineFileInputFormat and PathFilter. However, as I said, I want to use Apache Spark. Any ideas?
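A minimal sketch of one possible approach, assuming the folders sit on a filesystem that every executor can see and that the black-box executable accepts a folder path as a command-line argument; the folder list and the executable path (`/opt/tools/blackbox`) are placeholders, not names from the question. The idea is to parallelize the list of folder paths with one partition per folder, so Spark schedules one task per folder and each task simply shells out to the executable:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.sys.process._

object FolderRunner {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("process-folders")
    val sc = new SparkContext(conf)

    // Hypothetical list of the 100 input folder paths, visible to all executors.
    val folders: Seq[String] = (1 to 100).map(i => s"/data/input/folder$i")

    // One partition per folder, so each folder gets its own task.
    val folderRdd = sc.parallelize(folders, folders.size)

    // Run the black-box executable once per folder on whichever executor the task lands on.
    folderRdd.foreach { folder =>
      val exitCode = Process(Seq("/opt/tools/blackbox", folder)).!
      if (exitCode != 0)
        throw new RuntimeException(s"blackbox failed for $folder (exit code $exitCode)")
    }

    sc.stop()
  }
}
```

The number of folders processed concurrently is then bounded by the total executor cores of the job, so the 100 tasks run in parallel up to whatever resources the cluster provides.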