Check List
Acknowledgement
Pseudo Lab runs a Donates program with sponsorship from DataCamp. Through the program, we give job seekers, underemployed workers, nonprofit research scientists, and students access to the courses and tracks that DataCamp offers. This project grew out of the Data Science Fellowship, one of the DataCamp Donates programs.
Hello, we are the DE4E team at Pseudo Lab.
DE4E is a team of data analysts, data scientists, data engineers, and machine learning engineers working together to build a Data Engineering Repository of the data, by the data, for the data.
This page contains a data engineering checklist for junior data engineers, data analysts, and data scientists who are just getting started with data engineering.
If you find gaps in a particular stage or topic, we recommend studying that topic and building up the background knowledge before turning to DE4E.
One piece of good news: because DE4E is written around the theme of data engineering for everybody, we plan to present most topics in an easier and clearer way.
Now, shall we go through the key topics one stage at a time?
Fundamentals
CS Fundamentals
Introduction to DE4E: Data Engineering for Everybody
Introduction to Data Engineering
Introduction to Shell Programming and Data Processing in Shell
Introduction to Bash Scripting
Python Programming
Introduction to Relational Databases in SQL
Pandas for data processing
Database Design
Introduction to Apache Airflow
Introduction to PySpark
Intermediate
Efficient Python Code
Writing Functions in Python
Unit Testing for Data Science in Python
OOP (Object-Oriented Programming) in Python
Big Data Fundamentals with PySpark
Data Analysis in SQL
Messaging
Monitoring
Networking
Advanced
Cleaning Data with PySpark
Introduction to IaC (Infrastructure as Code)
Introduction to CI/CD (Continuous Integration and Continuous Delivery)
Introduction to Data Security & Privacy
Introduction to DevOps
Introduction to DataOps
Introduction to Data Visualization
Machine Learning Fundamentals
Machine Learning Ops
Background Knowledge
About Data Engineering
Data Literacy
Data Analyst vs Data Engineer vs Data Scientist
Data Engineer's responsibilities
Structured Data, Semi-Structured Data and Unstructured Data
OLTP vs OLAP
ETL, ELT and Reverse ETL
Change Data Capture (CDC)
Data Lake vs Data Warehouse
Lakehouse
Data engineering process
Batch Data vs Streaming Data
Batch processing vs Stream processing
Scheduling
Hadoop Ecosystem
Parallel computing
Introduction to Cloud Computing