Korea Apartment Price Analysis

Data Analysis Project Based on Public Open Data

Project Overview

This project analyzes nationwide new apartment sale prices from 2013 to 2019, using data from Korea's Public Data Portal. Two datasets were reshaped with pd.melt() and merged with pd.concat() into 4,692 records, which were then used to examine trends in price per pyeong across 17 regions. Preprocessing and exploratory analysis were done in Python with Pandas.

Data Pipeline

1. Data Collection
Two CSV datasets sourced from Korea's Public Data Portal:
  • Dataset A — 2013.9–2015.8 (17 rows × 22 cols, wide format)
  • Dataset B — 2015.10–2019.12 (4,335 rows × 5 cols, long format)
2. Data Cleaning
  • Resolved EUC-KR / CP949 character encoding
  • Price column stored as object → converted to float64 via pd.to_numeric()
  • Identified 277 missing values (NaN) in the price field
import pandas as pd

# encoding fix + type conversion
df = pd.read_csv("data.csv", encoding="cp949")
df["price"] = pd.to_numeric(df["price_raw"], errors="coerce")
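As a minimal sketch of the coercion step, the example below uses a small in-memory frame instead of the real CSV. The column name price_raw follows the snippet above; the sample values (blank strings, a dash, a thousands separator) are hypothetical illustrations of why an object column may not parse cleanly, not values taken from the dataset.

```python
import pandas as pd

# Hypothetical sample values; the real file's contents may differ.
raw = pd.DataFrame({"price_raw": ["5841", "  ", None, "6,657", "-", "5879"]})

# Strip whitespace and thousands separators before converting.
cleaned = raw["price_raw"].str.strip().str.replace(",", "", regex=False)

# errors="coerce" turns anything unparseable (blank, "-") into NaN
# instead of raising, so the column ends up numeric.
raw["price"] = pd.to_numeric(cleaned, errors="coerce")

n_missing = int(raw["price"].isna().sum())
print(raw["price"].dtype, n_missing)
```

Counting NaN values before and after the conversion is a cheap way to see how many entries were non-numeric text rather than truly empty.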
3. Feature Engineering
  • Price per ㎡ → price per pyeong (× 3.3)
  • Parsed apartment size categories from Korean text strings
  • Dropped redundant source columns
# unit conversion + text parsing
df["price_pyeong"] = df["price"] * 3.3
df["area"] = df["size_class"].str.replace("...", "")
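The feature engineering step could look roughly like the sketch below. The Korean size labels are hypothetical stand-ins, since the actual size_class strings are elided in the snippet above, and the prices are made up. The column names price, price_pyeong, size_class, and area follow the original snippet.

```python
import pandas as pd

SQM_TO_PYEONG_FACTOR = 3.3  # conversion factor used on this page

# Hypothetical labels and made-up prices per ㎡.
df = pd.DataFrame({
    "size_class": [
        "전용면적 60㎡이하",            # exclusive area, 60㎡ or less
        "전용면적 60㎡초과 85㎡이하",   # over 60㎡, up to 85㎡
        "전용면적 102㎡초과",           # over 102㎡
    ],
    "price": [5000.0, 5500.0, 6000.0],
})

# Price per ㎡ -> price per pyeong.
df["price_pyeong"] = df["price"] * SQM_TO_PYEONG_FACTOR

# Reduce the Korean text to a compact range label.
df["area"] = (
    df["size_class"]
    .str.replace("전용면적", "", regex=False)  # drop "exclusive area"
    .str.replace("초과", "~", regex=False)     # "over" becomes a range marker
    .str.replace("이하", "", regex=False)      # drop "or less"
    .str.replace(" ", "", regex=False)
)

# The source column is redundant once "area" exists.
df = df.drop(columns=["size_class"])
print(df)
```

Passing regex=False keeps each replacement a literal string match, which avoids surprises if a label ever contains regex metacharacters.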
4. Data Integration
  • Wide-format Dataset A → long format via pd.melt()
  • Parsed year/month from Korean date strings
  • Merged both datasets with pd.concat() (4,692 rows)
# reshape + merge
df_melt = df_first.melt(id_vars=["region"])
df = pd.concat([df_first_prep, df_last_prep])
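The reshape and merge can be sketched end to end on toy data. The column labels such as "2013년12월", the intermediate name period, and all prices are assumptions for illustration; only df_first, df_melt, df_first_prep, df_last_prep, and the region column come from the snippet above.

```python
import pandas as pd

# Hypothetical wide frame: one row per region, one column per month.
df_first = pd.DataFrame({
    "region": ["Seoul", "Busan"],
    "2013년12월": [18000, 8000],   # made-up values
    "2014년1월": [18100, 8050],
})

# Wide -> long: each (region, month) pair becomes its own row.
df_melt = df_first.melt(id_vars=["region"], var_name="period", value_name="price")

# Pull year and month out of labels like "2013년12월" (년 = year, 월 = month).
parts = df_melt["period"].str.extract(r"(?P<year>\d{4})년\s*(?P<month>\d{1,2})월")
df_first_prep = df_melt.drop(columns="period").assign(
    year=parts["year"].astype(int),
    month=parts["month"].astype(int),
)

# Hypothetical long frame standing in for Dataset B.
df_last_prep = pd.DataFrame({
    "region": ["Seoul"], "price": [19000], "year": [2015], "month": [10],
})

# Stack both long frames; ignore_index avoids duplicate row labels.
df = pd.concat([df_first_prep, df_last_prep], ignore_index=True)
print(df.shape)
```

If Dataset A has a single identifier column, its 17 rows × 21 value columns melt into 357 rows, and 357 + 4,335 = 4,692, which is consistent with the merged total reported on this page.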
5. Analysis
  • groupby("region") → regional averages
  • groupby("year") → yearly trends
  • groupby("area") → size category comparison
# aggregation
df.groupby(["region"])["price_pyeong"].mean()
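The three groupings listed above can be sketched on a few made-up rows. The column names region, year, area, and price_pyeong follow the earlier snippets; the values and the year-over-year step are illustrative assumptions, not the project's actual figures or code.

```python
import pandas as pd

# Made-up rows; one price is missing on purpose.
df = pd.DataFrame({
    "region": ["Seoul", "Seoul", "Busan", "Busan"],
    "year": [2018, 2019, 2018, 2019],
    "area": ["60㎡", "60㎡~85㎡", "60㎡", "60㎡~85㎡"],
    "price_pyeong": [20000.0, 24000.0, 10000.0, None],
})

# mean() skips NaN by default, so missing prices are excluded from averages.
by_region = df.groupby("region")["price_pyeong"].mean().sort_values(ascending=False)
by_year = df.groupby("year")["price_pyeong"].mean()
by_area = df.groupby("area")["price_pyeong"].mean()

# Year-over-year change of the yearly average, in percent.
yoy_pct = by_year.pct_change() * 100

print(by_region, by_year, by_area, yoy_pct, sep="\n\n")
```

Because NaN rows are silently skipped, a group with many missing prices is averaged over fewer records, which is worth keeping in mind when comparing regions.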
Raw Data Schema (Before)
5 columns · 4,335 rows · Dataset B

Column        Type      Non-Null
Region        object    4,335
Size Class    object    4,335
Year          int64     4,335
Month         int64     4,335
Price (㎡)    object    4,058
Processed Schema (After)
6 columns · 4,335 rows · cleaned & engineered

Column            Type       Non-Null
Region            object     4,335
Year              int64      4,335
Month             int64      4,335
Price             float64    3,957
Price / Pyeong    float64    3,957
Area Type         object     4,335

Key Statistics

  • Total Records: 4,692
  • Regions Analyzed: 17
  • Analysis Period: 7 years (2013–2019)
  • National Avg. Price Increase: 30.5%

Data Visualization

Average Price per Pyeong by Region

Yearly Trend: National vs Seoul (2013–2019)

Average Price by Apartment Size

Yearly Price Distribution

Price Distribution (KRW per Pyeong)

Distribution of 3,957 price records across 17 regions (2015–2019). Right-skewed with outlier at 42,002.

Price Trend by Apartment Size (2015–2019)

Regional Price Heatmap (2013–2019)

Average price per pyeong by region and year. Darker blue = higher price. Combined from two datasets via pd.melt() + pd.concat().
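A region-by-year grid like the one behind this heatmap can be built with a pivot table. This is a sketch on made-up rows, assuming the region, year, and price_pyeong columns from the pipeline snippets; it is not necessarily how the chart on this page was produced.

```python
import pandas as pd

# Made-up rows; Seoul has two records in 2018 to show the averaging.
df = pd.DataFrame({
    "region": ["Seoul", "Seoul", "Seoul", "Busan", "Busan"],
    "year": [2018, 2018, 2019, 2018, 2019],
    "price_pyeong": [20000.0, 22000.0, 24000.0, 10000.0, 11000.0],
})

# Rows = regions, columns = years, cells = mean price per pyeong.
heat = df.pivot_table(
    index="region", columns="year", values="price_pyeong", aggfunc="mean"
)
print(heat)

# The grid could then be drawn with, for example,
# seaborn.heatmap(heat, cmap="Blues") if Seaborn is installed.
```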

Key Insights

Seoul's average price per pyeong is approximately twice the national average, reflecting a clear concentration in the capital region.
Combining both datasets reveals the national average rose by 51.2% from 2013 to 2019 (8,059 to 12,188 KRW/pyeong), accelerating sharply after 2017.
Seoul recorded a 32.5% increase, outpacing the 30.5% national average increase reported under Key Statistics.
Larger apartments (102㎡+) command a 12% premium over the overall average, while mid-size units (60–85㎡) are the most affordable.
The price distribution widened significantly in 2019: the maximum reached 42,002 KRW/pyeong while the minimum barely moved, indicating a growing gap between the most and least expensive markets.
The highest recorded price was 42,002 KRW/pyeong in Seoul for 85–102㎡ apartments (Jun–Dec 2019), nearly 4× the national median of 10,686.
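The percentage and ratio figures quoted above follow directly from the stated values. The short check below recomputes two of them; the helper name pct_increase is introduced here for illustration only.

```python
def pct_increase(start: float, end: float) -> float:
    """Percent change from start to end."""
    return (end - start) / start * 100


# National average, 2013 -> 2019, using the figures quoted in the insights.
national = pct_increase(8059, 12188)

# Highest recorded price relative to the national median.
ratio = 42002 / 10686

print(f"{national:.1f}%  {ratio:.2f}x")  # 51.2%  3.93x
```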

Tools & Technologies

Python · Pandas · Jupyter Notebook · Matplotlib · Seaborn · Korea Public Data Portal · Chart.js