Methods of Big Data Processing
Major: Information Systems and Technologies
Code of subject: 7.126.01.O.003
Credits: 7.00
Department: Information Systems and Networks
Lecturer: doctor of sciences, professor Andrii Berko
Semester: 1st semester
Mode of study: full-time
Objectives: The course develops the following competencies in students:
General competencies:
INT. The ability to solve problems of a research and innovation nature in the field of information systems and technologies.
ZK01. Ability to think abstractly, analyze, and synthesize.
ZK05. Ability to evaluate and ensure the quality of work performed in ICT.
Professional competencies:
SK01. Ability to develop and apply IST necessary for solving strategic and current tasks.
SK04. The ability to develop mathematical, information, and computer models of objects and informatization processes.
SK05. Ability to use modern data analysis technologies to optimize processes in information systems.
SK06. Ability to manage information risks based on the concept of information security.
SK07. Develop and implement innovative projects in the field of ICT.
SK08. The ability to conduct scientific and scientific-pedagogical activities in the field of ICT.
Learning outcomes: As a result of studying the academic discipline, the student must be able to demonstrate the following learning outcomes:
PH05. Determine the requirements for ICT based on the analysis of business processes and analysis of the needs of interested parties, and develop technical tasks.
PH06. Justify the choice of technical and software solutions, taking into account their interaction and potential impact on solving organizational problems, and organize their implementation and use.
PH10. Provide high-quality cyber protection of ICT, plan, organize, implement, and monitor the functioning of information protection systems.
PH12. Plan and carry out scientific research in the field of ICT, formulate and test hypotheses, choose methods, substantiate conclusions, and present results.
PH13. Develop and teach special disciplines on information systems and technologies in institutions of higher education.
Program learning outcomes are determined by the Higher Education Institution
PH14. Design distributed and intelligent information systems of various kinds and organize their implementation, use, and support, based on the analysis of organizational needs and capabilities.
Required prior and related subjects: Distributed information systems
Information resource integration technologies
Information technologies of computer networks
Summary of the subject: Concept and definition of Big Data. Development of the Big Data concept. Big data analysis techniques. Big data analysis technologies. Big data storage models. MapReduce computing model. Big data processing tools. Application of Big Data in various domains.
Description: 1. The concept of Big Data
Concept and definition of Big Data. Properties of Big Data. Requirements for Big Data. The specifics of Big Data. Classification of big data. Structured data. Sources of big structured data. Relational databases in big data. Unstructured data. Sources of unstructured data. The role of CMS in big data management. Management of heterogeneous data. Integrating different types of data into a big data environment.
2. Evolution of Big Data.
The evolution of data management.
Stage 1: Creating managed data structures
Stage 2: Website and Content Management
Stage 3: Big Data Management
Processing large volumes of data on mainframes. Prerequisites and factors behind the emergence of the Big Data field. Formation and development of Big Data technologies. Application domains of big data. The current state and prospects for the development of Big Data.
3. Big data analysis techniques.
A/B testing. Classification. Cluster analysis. Crowdsourcing (data selection). Data migration and integration. Data mining. Determination of agreement (harmony) of data. Genetic algorithms. Machine learning. Natural language processing. Network analysis. Optimization. Pattern recognition. Predictive modeling. Regression analysis. Signal processing. Spatial data analysis. Statistics. Simulation modeling. Time series analysis. Analysis of associations. Analysis of functional relationships. Discovery of hidden relationships.
4. Big data management technologies
Operational databases. Relational DBMS in a big data environment (relational database - SQL).
Non-relational DBMS (key-value databases, document databases, columnar databases, graph databases, spatial databases).
Specialized Big Data repositories.
Streaming data.
5. MapReduce calculation model
The MapReduce paradigm. Origins of MapReduce.
Principles of the Map function. Principles of the Reduce function. Combination of Map and Reduce functions.
Optimization of MapReduce tasks. Hardware and network topology. Data synchronization.
MapReduce file system.
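As an illustration of how the Map and Reduce functions combine, the following is a minimal single-process sketch of a word count, assuming plain Python with no Hadoop cluster. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are hypothetical and chosen only for this example; a real MapReduce framework runs the same three steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: fold all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle, and reduce in sequence (illustrative only)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical sample input: two short "documents".
counts = word_count(["big data tools", "big data models"])
print(counts)
```

Because each call to map_phase depends only on its own record and each call to reduce_phase depends only on one key's values, both steps can in principle be distributed without coordination; only the shuffle step moves data between workers.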
6. Big data processing tools
The Hadoop big data processing system. Principles of Hadoop. Hadoop Distributed File System (HDFS). HDFS name nodes. HDFS data representation. Hadoop and MapReduce. The Hadoop ecosystem. Building a big data resource with the Hadoop ecosystem.
Hadoop YARN, the resource and application management tool. The HBase big data store. Exploring big data with Hive.
7. Big data analytics
Definition of big data analysis. Using big data to get results.
Basic analytics. Advanced analytics. Operational analytics. Descriptive analytics. Predictive analytics. Prescriptive analytics. Monetization of analytics.
8. Application of Big Data in subject areas.
Environmental monitoring. Social processes. Public administration. Marketing. Trade. E-commerce. Medicine. Stock exchange activity. Politics.
Assessment methods and criteria: • Continuous assessment (45%): written reports on laboratory work, an essay, oral questioning
• Final assessment (55%, exam): written and oral components.
Criteria for assessing learning outcomes: Continuous assessment: completion and defense of laboratory work, oral and whole-class questioning
Final assessment: oral questioning, test.
Procedure and criteria for assigning points and grades: 100-88 points: certified with a grade of “excellent”. High level: the student demonstrates an in-depth mastery of the conceptual and categorical apparatus of the discipline, systematic knowledge, and the skills and abilities to apply it in practice. The mastered knowledge, skills and abilities provide the ability to independently formulate goals and organize learning activities, and to search for and find solutions in non-standard, atypical educational and professional situations. The student demonstrates the ability to make generalizations based on critical analysis of factual material, ideas, theories and concepts, and to formulate conclusions based on them. Their activity is based on interest and motivation for self-development, continuous professional development, and independent research activities, carried out with the support and guidance of the teacher.
87-71 points: certified with a grade of “good”. Sufficient level: involves mastery of the conceptual and categorical apparatus of the discipline at an advanced level and conscious use of knowledge, skills and abilities to reveal the essence of the issue. Possession of a partially structured set of knowledge provides the ability to apply it in familiar educational and professional situations. Aware of the specifics of tasks and learning situations, the student demonstrates the ability to search for and choose a solution following a given sample, and to justify the use of a particular method of solving the problem. Their activities are based on interest and motivation for self-development and continuous professional development.
70-50 points: certified with a grade of “satisfactory”. Satisfactory level: indicates mastery of the conceptual and categorical apparatus of the discipline at an average level, partial awareness of educational and professional tasks, problems and situations, and knowledge of ways to solve typical problems and tasks. The student demonstrates an average level of skills and abilities to apply knowledge in practice, and needs assistance or a model to follow when solving problems. The basis of learning activities is situational and heuristic, dominated by motives of duty and unconscious use of opportunities for self-development.
49-0 points: certified with a grade of “unsatisfactory”. Unsatisfactory level: indicates an elementary mastery of the conceptual and categorical apparatus of the discipline, a general understanding of the content of the educational material, and partial use of knowledge, skills and abilities. The basis of learning activities is situational and pragmatic interest.
Recommended books: 1. White, Tom // Hadoop: The Definitive Guide // O'Reilly Media, 2009.
2. Hadoop. Apache Software Foundation // http://hadoop.apache.org/
3. Finley, Klint // Steve Ballmer on Microsoft's Big Data Future and More in This Week's Business Intelligence Roundup // ReadWriteWeb, 2011.
4. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al. // Bigtable: A Distributed Storage System for Structured Data // Google Lab, 2006.
6. Jeffrey Dean, Sanjay Ghemawat // MapReduce: Simplified Data Processing on Large Clusters // Google Inc., 2004.
7. Judy Qiu // Cloud Technologies and Their Applications // Indiana University Bloomington, 2010
8. The Hadoop Distributed File System: Architecture and Design // http://hadoop.apache.org/common/docs/r0.17.2/hdfs_design.html
9. Ralf Lammel // Google’s MapReduce Programming Model — Revisited // Microsoft Corp.
Unified appendix: Lviv Polytechnic National University ensures the realization of the right of persons with disabilities to obtain higher education. Inclusive educational services are provided through the Service of accessibility to learning opportunities "Without restrictions", the purpose of which is to provide permanent individual support for the educational process of students with disabilities and chronic diseases. An important tool for the implementation of the inclusive educational policy at the University is the Program for improving the qualifications of scientific and pedagogical workers and educational and support staff in the field of social inclusion and inclusive education. Contact at:
St. Starosolskikh, 2/4, 1st academic building, room 112
E-mail: nolimits@lpnu.ua
Websites: https://lpnu.ua/nolimits https://lpnu.ua/integration
Academic integrity: The policy regarding the academic integrity of the participants in the educational process is based on compliance with the principles of academic integrity, taking into account the norms of the "Regulations on academic integrity at the Lviv Polytechnic National University" (approved by the academic council of the university on June 20, 2017, protocol No. 35).