Optimizing US Real Estate Investments
Analyzing U.S. real estate listings to uncover market trends, pricing patterns, and investment opportunities through data visualization and statistical modeling.
Data
U.S. real estate listings sourced from Realtor.com via Kaggle
Audience
Real-estate investors and home buyers
Original Dataset:
Total Listings: 2,226,382 real estate listings
Geographic Scope: Included properties from the U.S., Guam, Puerto Rico, the District of Columbia, and New Brunswick
Refinement: Removed all listings outside the 50 U.S. states to focus on relevant real estate markets
Outlier Removal: The following filters were applied to improve data accuracy and relevance:
Low-Price Listings: 22,070 listings priced below $50,000 were removed, as such low prices are typically unrealistic for solid real estate investments.
Extreme Bedroom/Bathroom Counts: Removed listings with more than 20 bedrooms or 20 bathrooms (e.g., one listing had 830 bathrooms, likely a hotel, not a home).
Zero Bed/Bath Listings: Excluded properties with 0 bedrooms and 0 bathrooms, as our focus is home sales, not land purchases.
Oversized Properties: Removed listings over 25,000 sq. ft, as these are either data errors or beyond the scope of average investors and homebuyers.
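The cleaning rules above can be sketched as a single predicate applied to each listing. This is a minimal illustration, not the project's actual code: the field names (`state`, `price`, `bed`, `bath`, `house_size`) are assumptions about the dataset's columns, `US_STATES` is a shortened placeholder for the full list of 50 states, and the sample rows are invented. How missing values were handled is not described, so the sketch assumes every field is present.

```python
# Thresholds taken from the cleaning rules described above.
MIN_PRICE = 50_000
MAX_BEDS = 20
MAX_BATHS = 20
MAX_HOUSE_SIZE = 25_000  # square feet

# Placeholder subset (assumption); the real set would hold all 50 state names.
US_STATES = {"Texas", "Florida", "California"}


def keep_listing(row, states=US_STATES):
    """Return True if a listing passes every cleaning rule."""
    if row["state"] not in states:
        return False  # outside the 50 U.S. states
    if row["price"] < MIN_PRICE:
        return False  # low-price listing
    if row["bed"] > MAX_BEDS or row["bath"] > MAX_BATHS:
        return False  # extreme bedroom/bathroom count
    if row["bed"] == 0 and row["bath"] == 0:
        return False  # likely land, not a home
    if row["house_size"] > MAX_HOUSE_SIZE:
        return False  # oversized property
    return True


# Invented sample rows, one per rule, for illustration only.
listings = [
    {"state": "Texas", "price": 250_000, "bed": 3, "bath": 2, "house_size": 1_800},
    {"state": "Puerto Rico", "price": 180_000, "bed": 2, "bath": 1, "house_size": 900},
    {"state": "Florida", "price": 30_000, "bed": 2, "bath": 1, "house_size": 800},
    {"state": "California", "price": 900_000, "bed": 4, "bath": 830, "house_size": 5_000},
    {"state": "Texas", "price": 75_000, "bed": 0, "bath": 0, "house_size": 0},
    {"state": "Florida", "price": 5_000_000, "bed": 10, "bath": 10, "house_size": 40_000},
]

cleaned = [row for row in listings if keep_listing(row)]
print(len(cleaned), "of", len(listings), "listings kept")
```

Keeping each rule as its own check makes it easy to count how many listings each filter removes, which is how a figure like the 22,070 low-price listings could be reported.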
Final Dataset:
Total After Cleaning: 1,530,447 real estate listings across all 50 U.S. states
New Feature: Price Range Categorization
Low Range: Less than $300,000
Mid Range: Between $300,000 and $700,000
High Range: Greater than $700,000
This segmentation helps investors at different financial levels identify suitable opportunities.
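As a rough sketch, the categorization can be expressed as a small function. The function name and label strings are illustrative assumptions. Because the low range is defined as "less than" $300,000 and the high range as "greater than" $700,000, the sketch treats both boundary values as mid range.

```python
LOW_MAX = 300_000   # prices below this are "Low"
HIGH_MIN = 700_000  # prices above this are "High"


def price_range(price):
    """Map a listing price to the Low / Mid / High category."""
    if price < LOW_MAX:
        return "Low"
    if price <= HIGH_MIN:
        return "Mid"  # both boundaries fall in the mid range
    return "High"


for price in (120_000, 300_000, 700_000, 1_250_000):
    print(price, price_range(price))
```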
Techniques used for the project:
Data Cleaning & Preprocessing
Exploratory Data Analysis
Data Visualization with Python and Tableau
Time-Series Analysis
Statistical Modeling
Hypothesis Testing
Exploratory Data Analysis (EDA) and Insights for Real Estate Pricing
To better understand the factors influencing property prices, we conducted an Exploratory Data Analysis (EDA) on the real estate dataset. Our primary goal was to identify relationships between price and key property characteristics, including the number of bedrooms, bathrooms, and house size (measured in square feet). The EDA process involved the following steps:
Univariate and Bivariate Analysis
Analyzed distributions of bedrooms, bathrooms, house sizes, and prices.
Created scatterplots to visualize relationships between these features and property prices.
Key Scatterplot Insights
Bedrooms vs. Price:
A positive relationship was observed; however, the increase in price slowed after four bedrooms, suggesting diminishing returns for larger homes.
Bathrooms vs. Price:
The number of bathrooms showed a stronger correlation with price than bedrooms, highlighting the added value of additional bathrooms, especially when increasing from one to two.
House Size vs. Price:
House size had the most direct and consistent relationship with price, with larger properties commanding significantly higher values.
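One way to put numbers behind scatterplot comparisons like these is the Pearson correlation between each feature and price. The sketch below is illustrative only: the field names are assumptions, the five sample rows are invented, and the analysis itself may have used a different measure. Pearson correlation also captures only linear association, so it would not by itself show the plateau after four bedrooms.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Invented sample rows for illustration only.
sample = [
    {"bed": 2, "bath": 1, "house_size": 900, "price": 150_000},
    {"bed": 3, "bath": 2, "house_size": 1_500, "price": 280_000},
    {"bed": 3, "bath": 2, "house_size": 1_800, "price": 330_000},
    {"bed": 4, "bath": 3, "house_size": 2_400, "price": 450_000},
    {"bed": 5, "bath": 3, "house_size": 2_600, "price": 470_000},
]

prices = [row["price"] for row in sample]
correlations = {
    feature: pearson([row[feature] for row in sample], prices)
    for feature in ("bed", "bath", "house_size")
}

# List features from strongest to weakest correlation with price.
for feature, r in sorted(correlations.items(), key=lambda item: -abs(item[1])):
    print(f"{feature}: r = {r:.3f}")
```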
Insights for Investors:
House size is the most reliable predictor of price, making it a critical factor in evaluating investment opportunities.
Additional bathrooms tend to increase property value more significantly than additional bedrooms.
Investors should prioritize listings with potential for bathroom additions or properties with above-average square footage for the neighborhood.
Real Estate Market Trends Since 2016
Overall Price Growth: Real estate listing prices have trended upward overall since 2016, with the year-to-year exceptions noted below.
2020 Decline: A noticeable drop in listings occurred in 2020, likely influenced by the COVID-19 pandemic.
Peak in 2021: Listing prices reached an all-time high in 2021.
2022 Adjustment: The average price experienced a slight decline in 2022.
2023 Rebound: Prices saw a modest increase again in 2023.
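A yearly trend like the one summarized above can be derived by averaging prices per year and then computing the year-over-year change. The sketch below assumes each record carries an ISO-formatted date string in a hypothetical `date` field; which date column the analysis actually used is not stated here, and the sample rows are invented.

```python
from collections import defaultdict


def mean_price_by_year(rows):
    """Average price per calendar year, keyed by year in ascending order."""
    totals = defaultdict(lambda: [0.0, 0])  # year -> [price sum, row count]
    for row in rows:
        year = int(row["date"][:4])  # assumes "YYYY-MM-DD" strings
        totals[year][0] += row["price"]
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


def year_over_year_change(yearly):
    """Percent change in average price versus the previous available year."""
    years = sorted(yearly)
    return {
        cur: (yearly[cur] - yearly[prev]) / yearly[prev] * 100
        for prev, cur in zip(years, years[1:])
    }


# Invented sample rows for illustration only.
rows = [
    {"date": "2020-03-15", "price": 100_000},
    {"date": "2020-09-01", "price": 300_000},
    {"date": "2021-05-20", "price": 300_000},
]

yearly = mean_price_by_year(rows)
print(yearly)                         # {2020: 200000.0, 2021: 300000.0}
print(year_over_year_change(yearly))  # {2021: 50.0}
```

The same grouping could count listings per year instead of averaging prices, which would separate a drop in listing volume from a drop in price.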
Results and Recommendations
Optimal Timing for Investment:
Now is a strategic time to invest in real estate. Listing prices, which spiked in 2021, have since decreased but are expected to rise over time.
Key Property Features Affecting Price:
House Size: The most significant factor influencing property prices.
Number of Bathrooms: A strong correlation with higher property values, particularly when increasing from one to two bathrooms.
Number of Bedrooms: While impactful, the effect plateaus after four bedrooms.
Price Ranges with the Most Listings:
Mid-Range ($300,000 - $700,000): The majority of listings fall within this category.
Low-Range (Less than $300,000): A significant number of properties are still available at this price point, making it a good fit for first-time investors and home buyers.
High-Range (Over $700,000): Fewer listings, but substantial investment opportunities.
Top States for Listings by Price Range:
Mid-Range: Florida, California, Texas.
Low-Range: Texas, Florida, Illinois, Pennsylvania.
High-Range: California, Florida, New York.
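A ranking like this can be produced by counting listings per state within each price range. The sketch below is a minimal illustration with invented rows; the `state` and `price_range` field names and the function name are assumptions.

```python
from collections import Counter, defaultdict


def top_states_by_range(rows, n=3):
    """Return the n states with the most listings in each price range."""
    counts = defaultdict(Counter)  # price range -> Counter of states
    for row in rows:
        counts[row["price_range"]][row["state"]] += 1
    return {
        price_range: [state for state, _ in counter.most_common(n)]
        for price_range, counter in counts.items()
    }


# Invented sample rows for illustration only.
rows = (
    [{"state": "Texas", "price_range": "Low"}] * 3
    + [{"state": "Florida", "price_range": "Low"}] * 2
    + [{"state": "Illinois", "price_range": "Low"}]
    + [{"state": "California", "price_range": "High"}] * 2
    + [{"state": "New York", "price_range": "High"}]
)

print(top_states_by_range(rows, n=2))
```

Raw counts favor populous states, so dividing by each state's total listings would show where a price range is proportionally most common.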
These insights provide a foundation for data-driven investment decisions, helping investors focus on high-impact features and regions with promising opportunities.
🔍 Next Steps for Analysis
To deepen insights, further analysis can explore:
🏙️ Regional & Urban vs. Rural Trends
Analyze state- and city-level data to compare urban vs. rural pricing.
Identify top cities for investment and assess long-term rural property appreciation.
📈 Market Conditions & Economic Factors
Examine interest rates, inflation, and COVID-19’s impact on housing prices.
🏡 Property Features & Value Drivers
Assess price differences by property type (single-family, condo, multi-unit).
Analyze the impact of square footage, lot size, and amenities on home value.