Big Data

Credits: 6

Semester: 3

Course: Elective

Language of the course: English


  • Study of the main reasons for the emergence of Big Data, and its detection and identification.
  • Introduction to Grid technologies, WMS, MapReduce, stream data processing
  • Understanding of MapReduce principles and Apache Hadoop technology
  • Understanding of HDFS principles and building an Apache Hadoop infrastructure
  • Introduction to Apache Storm technology



Main topics of the discipline:

  • Definition of the term big data and the basic model. Uses of big data. The role of big data in the national economy.
  • Requirements for the profession of big data analyst.
  • The main stages of the data life cycle. Collection, consolidation and cleaning of data.
  • Correlation coefficient. Graphical representation. Statement of the problem of regression analysis. Linear regression. Least squares method. Their role in the analysis of big data.
  • Data collection and consolidation, data visualization, R language for analytics, work with DBMS.
  • Hadoop, HDFS, MapReduce, YARN, Storm, Apache Spark.
  • Importance of the big data phenomenon for the development of society and science. Causes of the big data trend.
  • Problems and opportunities associated with the emergence of big data.


Lectures and laboratory work.