Technologies and Infrastructure for Big Data
Entry requirements: basic knowledge of programming and web technologies, and familiarity with SQL and DBMSs
Language of the course: English
- The main drivers behind the emergence of Big Data, and how Big Data is identified today
- Introduction to data processing technologies: Grid, WMS, MapReduce
- Overview of the MapReduce core and Apache Hadoop
- Overview of HDFS and the basic Apache Hadoop infrastructure
- Introduction to Apache Spark and Spark Streaming
Big Data technology plays a central role in the development of modern software solutions at large industrial companies. Today, efficient processing and analysis of Big Data form the basis for successful business development and provide a competitive advantage over industry rivals. This course is therefore oriented towards developing students' skills in Big Data processing and analysis. It begins with a brief history of Big Data and its present-day definition and identification. The foundations of the HDFS distributed file system are then studied, along with the basics of Apache Hadoop and MapReduce. The course also covers Apache Spark and Spark Streaming. On completing the course, students will be able to work with core Big Data technologies such as Apache Hadoop and Apache Spark.
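To give a flavour of the MapReduce model covered in the course, here is a minimal sketch in plain Python. It does not use Hadoop itself; the phase functions and the sample documents are illustrative, showing only the map / shuffle / reduce pattern that Hadoop implements at cluster scale.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Hypothetical input: in Hadoop these would be file splits stored in HDFS.
docs = ["big data big ideas", "data pipelines move big data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"])   # 3
print(counts["data"])  # 3
```

In a real Hadoop job the same map and reduce logic would run in parallel across the cluster, with the framework handling the shuffle, fault tolerance, and data locality against HDFS.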
Lectures and workshops
Attendance is mandatory.
Grading: 60% coursework (20% data crawler, 20% implementation using Big Data technologies, 20% data analysis and reporting); 20% workshop participation; 20% final examination.