📝 Check List#

Acknowledgement#

가짜연구소 (Pseudo Lab) runs a Donates program sponsored by DataCamp. Through the program, job seekers, underemployed workers, nonprofit research scientists, and students get access to the courses and tracks that DataCamp offers. This project grew out of the Data Science Fellowship, one of the DataCamp Donates programs.

Hello, we are the DE4E team at 가짜연구소.

DE4E is a team of data analysts, data scientists, data engineers, and machine learning engineers who have come together to build a Data Engineering Repository of the data, by the data, and for the data.

This page contains a data engineering checklist for junior data engineers, data analysts, and data scientists who are just getting started with data engineering.

If you find gaps at any stage or in any topic, we recommend studying that topic and building up the background knowledge first, and then turning to DE4E.

The good news is that DE4E is a project written around the theme of data engineering for everybody, so we plan to present most topics as simply and clearly as we can 😃

그럼 λ‹¨κ³„λ³„λ‘œ 핡심 μ£Όμ œλ“€μ„ ν•˜λ‚˜μ”© μ‚΄νŽ΄λ³ΌκΉŒμš”?

Fundamentals#

  • CS Fundamentals

  • Introduction to DE4E: Data Engineering for Everybody

  • Introduction to Data Engineering

  • Introduction to Shell Programming and Data Processing in Shell

  • Introduction to Bash Scripting

  • Python Programming

  • Introduction to Relational Databases in SQL

  • Pandas for Data Processing

  • Database Design

  • Introduction to Apache Airflow

  • Introduction to PySpark

Intermediate#

  • Efficient Python Code

  • Writing Functions in Python

  • Unit Testing for Data Science in Python

  • OOP (Object-Oriented Programming) in Python

  • Big Data Fundamentals with PySpark

  • Data Analysis in SQL

  • Messaging

  • Monitoring

  • Networking

Advanced#

  • Cleaning Data with PySpark

  • Introduction to IaC (Infrastructure as Code)

  • Introduction to CI/CD (Continuous Integration and Continuous Delivery)

  • Introduction to Data Security & Privacy

  • Introduction to DevOps

  • Introduction to DataOps

  • Introduction to Data Visualization

  • Machine Learning Fundamentals

  • Machine Learning Ops

Background Knowledge#

  • About Data Engineering

  • Data Literacy

  • Data Analyst vs Data Engineer vs Data Scientist

  • Data Engineer's Responsibilities

  • Structured Data, Semi-Structured Data and Unstructured Data

  • OLTP vs OLAP

  • ETL, ELT and Reverse ETL

  • Change Data Capture (CDC)

  • Data Lake vs Data Warehouse

  • Lakehouse

  • Data Engineer's Process

  • Batch Data vs Streaming Data

  • Batch Processing vs Stream Processing

  • Scheduling

  • Hadoop Ecosystem

  • Parallel Computing

  • Introduction to Cloud Computing