Big data: gigabytes and terabytes of arrays of every kind.
How do we manage them efficiently and pull only the small pieces we need out of those huge arrays? Dedicated distributed servers, grouped into clusters. And how do we easily verify that the developers' results are correct?
Right, automated testing helps with such typical tasks. Write the task from the original requirement as simply as possible and compare the results; that is all that was required of us.
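
A minimal sketch of that idea, assuming a hypothetical production function `production_total_per_user` standing in for the real pipeline under test: the test re-implements the requirement naively in a few lines and compares the two results.

```python
# Sketch: check a production aggregation against a naive reference
# implementation written straight from the original requirement.
from collections import defaultdict


def production_total_per_user(purchases):
    # Hypothetical stand-in for the real (distributed) implementation under test.
    totals = defaultdict(float)
    for user_id, amount in purchases:
        totals[user_id] += amount
    return dict(totals)


def reference_total_per_user(purchases):
    # Naive re-implementation of the requirement:
    # "sum the purchase amounts per user".
    totals = {}
    for user_id, amount in purchases:
        totals[user_id] = totals.get(user_id, 0.0) + amount
    return totals


def test_totals_match():
    purchases = [("alice", 10.0), ("bob", 5.5), ("alice", 2.5)]
    assert production_total_per_user(purchases) == reference_total_per_user(purchases)
```

The reference version does not need to be fast or distributed; it only needs to be obviously correct, so any mismatch points at the production code.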