LEARN COMPLETE PYTHON IN 24 HOURS
🟦 Table of Contents – Master Data Science with Python
🔹 1. Introduction to Data Science & Python Setup
1.1 What is Data Science and Why Python
1.2 Data Science Career Paths
1.3 Python Environment Setup
1.4 Essential Libraries Overview
🔹 2. NumPy – Foundation of Numerical Computing
2.1 NumPy Arrays vs Python Lists
2.2 Array Operations, Broadcasting & Vectorization
2.3 Indexing, Slicing & Array Manipulation
2.4 Mathematical & Statistical Functions
🔹 3. Pandas – Data Manipulation & Analysis
3.1 Series and DataFrame
3.2 Data Loading
3.3 Data Cleaning & Transformation
3.4 Grouping & Aggregation
3.5 Handling Missing Values & Outliers
🔹 4. Data Visualization with Matplotlib & Seaborn
4.1 Matplotlib Basics
4.2 Seaborn Visualization
4.3 Advanced Plots
4.4 Publication-Ready Visualizations
🔹 5. Exploratory Data Analysis (EDA)
5.1 Data Distribution & Summary Statistics
5.2 Univariate, Bivariate & Multivariate Analysis
5.3 Correlation Analysis
5.4 EDA Case Study
🔹 6. Data Preprocessing & Feature Engineering
6.1 Data Scaling & Normalization
6.2 Encoding Categorical Variables
6.3 Feature Selection
6.4 Handling Imbalanced Data
🔹 7. Statistics & Probability for Data Science
7.1 Descriptive vs Inferential Statistics
7.2 Hypothesis Testing
7.3 Probability Distributions
7.4 Correlation & Regression
🔹 8. Machine Learning with Scikit-learn
8.1 Supervised Learning
8.2 Model Training & Evaluation
8.3 Cross-Validation
8.4 Unsupervised Learning
🔹 9. Advanced Data Science Topics
9.1 Time Series Analysis
9.2 NLP Basics
9.3 Deep Learning Introduction
9.4 Model Deployment
🔹 10. Real-World Projects & Case Studies
10.1 House Price Prediction
10.2 Customer Churn Prediction
10.3 Sentiment Analysis
10.4 Sales Dashboard
🔹 11. Best Practices, Portfolio & Career Guidance
11.1 Clean Code Practices
11.2 Portfolio Building
11.3 Git & Resume Tips
11.4 Interview Preparation
🔹 12. Next Steps & Learning Roadmap
12.1 Advanced Topics
12.2 Books & Resources
12.3 Career Opportunities
3. Pandas – Data Manipulation & Analysis
Pandas is one of the most widely used Python libraries for data wrangling, cleaning, exploration, and analysis. It is built on top of NumPy and provides two high-level data structures, Series and DataFrame, that make working with tabular data feel like using Excel or SQL, with the added flexibility of Python code.
Install Pandas (if not using Anaconda)
Bash
pip install pandas
Standard import (pd is the conventional alias)
Python
import pandas as pd
3.1 Series and DataFrame – Core Data Structures
Series: a one-dimensional labeled array (like a column in Excel or a vector with labels).
Python
# Create a Series from a list
s1 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s1)
# a    10
# b    20
# c    30
# d    40
# dtype: int64

# Access by label
print(s1['b'])  # 20

# From a dictionary
s2 = pd.Series({'Math': 85, 'Science': 92, 'English': 78})
print(s2['Science'])  # 92
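Because a Series carries labels, arithmetic between two Series lines values up by label rather than by position. The sketch below illustrates this with made-up values; the names jan and feb are hypothetical.

```python
import pandas as pd

# Two Series with partly overlapping labels (illustrative values)
jan = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
feb = pd.Series([1, 2, 3], index=['b', 'c', 'd'])

# Addition aligns on labels; a label missing on either side gives NaN
total = jan + feb
print(total)

# Series.add with fill_value=0 treats a missing label as 0 instead
total_filled = jan.add(feb, fill_value=0)
print(total_filled)
```

Here 'b' and 'c' are summed, while 'a' and 'd' become NaN in the first result and keep their single value in the second.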
DataFrame: a two-dimensional labeled data structure (like a spreadsheet or SQL table) and the heart of Pandas.
Python
# Create a DataFrame from a dictionary
data = {
    'Name': ['Anshuman', 'Priya', 'Rahul', 'Sneha'],
    'Age': [25, 23, 24, 22],
    'City': ['Ranchi', 'Delhi', 'Patna', 'Kolkata'],
    'Marks': [92, 88, 85, 90]
}
df = pd.DataFrame(data)
print(df)
#        Name  Age     City  Marks
# 0  Anshuman   25   Ranchi     92
# 1     Priya   23    Delhi     88
# 2     Rahul   24    Patna     85
# 3     Sneha   22  Kolkata     90

# Basic inspection
print(df.head(2))     # first 2 rows
df.info()             # data types, non-null counts (prints directly and returns None)
print(df.describe())  # summary statistics for numeric columns
print(df.shape)       # (rows, columns) → (4, 4)
Quick access
Python
df['Name']             # Series – one column
df[['Name', 'Marks']]  # DataFrame – multiple columns
df.iloc[0]             # first row (position-based)
df.loc[0, 'Name']      # label-based access
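With the default 0, 1, 2, ... index, position-based and label-based access look identical, which hides the difference. The sketch below uses hypothetical roll numbers as row labels to make the distinction visible.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Anshuman', 'Priya', 'Rahul'], 'Marks': [92, 88, 85]},
    index=[101, 102, 103],  # hypothetical roll numbers used as row labels
)

first_by_position = df.iloc[0, 0]      # row at position 0, column at position 0
first_by_label = df.loc[101, 'Name']   # row labelled 101, column 'Name'

# .loc also accepts a boolean mask together with a list of columns
subset = df.loc[df['Marks'] > 86, ['Name', 'Marks']]

# There is no row labelled 0 here, so label-based lookup of 0 fails
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(first_by_position, first_by_label, label_zero_exists)
print(subset)
```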
3.2 Data Loading (CSV, Excel, JSON, SQL)
Pandas provides a reader function for most common data sources.
CSV
Python
df = pd.read_csv("sales_data.csv")
# Options: skiprows=2, usecols=['date','sales'], dtype={'sales':float}
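To see what these options do without needing a file on disk, the sketch below reads a small in-memory CSV. The contents, including the two junk lines at the top, are invented for illustration; parse_dates is an extra option not listed above.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file with two non-data lines at the top
raw = io.StringIO(
    "exported by a hypothetical reporting tool\n"
    "generated for illustration only\n"
    "date,region,sales\n"
    "2024-01-01,North,100.5\n"
    "2024-01-02,South,200\n"
)

df = pd.read_csv(
    raw,
    skiprows=2,                 # skip the two junk lines before the header
    usecols=['date', 'sales'],  # load only these columns
    dtype={'sales': float},     # force sales to float
    parse_dates=['date'],       # parse the date column as datetime
)
print(df.dtypes)
print(df)
```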
Excel
Python
df = pd.read_excel("report.xlsx", sheet_name="Sales", skiprows=1)
# Requires an Excel engine: pip install openpyxl (for .xlsx) or xlrd (for legacy .xls)
JSON
Python
df = pd.read_json("data.json")
# or pd.json_normalize() for nested JSON
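For nested JSON, pd.json_normalize flattens inner objects into dotted column names. A minimal sketch with made-up records:

```python
import pandas as pd

# Hypothetical nested records, as they might come from json.load()
records = [
    {'id': 1, 'name': 'Priya', 'address': {'city': 'Delhi', 'pin': '110001'}},
    {'id': 2, 'name': 'Rahul', 'address': {'city': 'Patna', 'pin': '800001'}},
]

# Nested keys become columns such as 'address.city'
flat = pd.json_normalize(records)
print(flat.columns.tolist())
print(flat)
```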
SQL (with database connection)
Python
import sqlalchemy as sa

engine = sa.create_engine("sqlite:///mydb.db")
df = pd.read_sql("SELECT * FROM customers", engine)
# or pd.read_sql_query(query, engine)
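The round trip can be tried without any database file. The sketch below swaps SQLAlchemy for Python's built-in sqlite3 module and an in-memory database, which pandas also accepts as a connection; the table contents are invented.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database, so nothing is written to disk
conn = sqlite3.connect(":memory:")

customers = pd.DataFrame({
    'name': ['Priya', 'Rahul', 'Sneha'],
    'city': ['Delhi', 'Patna', 'Delhi'],
})
customers.to_sql('customers', conn, index=False)  # write the DataFrame as a table

# Parameterised query: the ? placeholder is filled from params
delhi = pd.read_sql(
    "SELECT name FROM customers WHERE city = ? ORDER BY name",
    conn,
    params=('Delhi',),
)
conn.close()
print(delhi)
```

Passing values through params instead of formatting them into the SQL string avoids quoting problems and SQL injection.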
Quick save
Python
df.to_csv("cleaned_data.csv", index=False)
df.to_excel("report.xlsx", index=False)
df.to_json("data.json", orient="records")
3.3 Data Cleaning, Filtering & Transformation
Real data is messy — Pandas excels at cleaning it.
Basic cleaning
Python
df = df.drop_duplicates()                         # remove duplicate rows
df = df.dropna(subset=['age', 'city'])            # drop rows with missing values
df['age'] = df['age'].fillna(df['age'].median())  # fill missing with median
df['salary'] = df['salary'].astype(float)         # change data type
df['date'] = pd.to_datetime(df['date'])           # convert to datetime
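The effect of each cleaning step is easier to follow on a tiny table. The sketch below applies the same calls to four invented rows containing a duplicate, missing values, and numbers stored as text.

```python
import pandas as pd

# Hypothetical messy data
raw = pd.DataFrame({
    'age': [25, 25, None, 31],
    'city': ['Delhi', 'Delhi', 'Patna', None],
    'salary': ['50000', '50000', '62000', '71000'],  # numbers stored as strings
    'date': ['2024-01-05', '2024-01-05', '2024-02-10', '2024-03-15'],
})

clean = raw.drop_duplicates()          # rows 0 and 1 are identical, one is dropped
clean = clean.dropna(subset=['city'])  # the row with no city is dropped
clean = clean.assign(
    age=clean['age'].fillna(clean['age'].median()),  # remaining missing age gets the median
    salary=clean['salary'].astype(float),            # text to float
    date=pd.to_datetime(clean['date']),              # text to datetime
)
print(clean)
print(clean.dtypes)
```

Two rows remain; the missing age is filled with 25, the median of the ages still present at that point.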
Filtering
Python
high_salary = df[df['salary'] > 80000]
young_delhi = df[(df['age'] < 30) & (df['city'] == 'Delhi')]
top_10 = df.nlargest(10, 'marks')
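Combined conditions can also be written with DataFrame.query, which some readers find easier to scan than bracketed masks. The sketch below, on invented data, checks that both forms select the same rows.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'age': [28, 35, 24, 29],
    'city': ['Delhi', 'Delhi', 'Patna', 'Delhi'],
    'marks': [70, 95, 88, 91],
})

# Boolean mask: each condition needs its own parentheses, combined with &
mask_result = df[(df['age'] < 30) & (df['city'] == 'Delhi')]

# The same filter expressed as a query string
query_result = df.query("age < 30 and city == 'Delhi'")

# Rows with the two highest marks
top_2 = df.nlargest(2, 'marks')

print(mask_result)
print(top_2)
```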
Transformations
Python
import numpy as np  # needed for np.inf below

df['tax'] = df['salary'] * 0.18                   # new column
df['full_name'] = df['first'] + " " + df['last']  # combine two text columns
df['salary_category'] = pd.cut(
    df['salary'],
    bins=[0, 50000, 100000, np.inf],
    labels=['Low', 'Medium', 'High']
)
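By default pd.cut makes each bin include its right edge and exclude its left edge, which matters for values sitting exactly on a boundary. A small sketch with invented salaries:

```python
import numpy as np
import pandas as pd

salaries = pd.Series([30000, 50000, 50001, 100000, 250000])

# Bins are (0, 50000], (50000, 100000], (100000, inf]
category = pd.cut(
    salaries,
    bins=[0, 50000, 100000, np.inf],
    labels=['Low', 'Medium', 'High'],
)
print(category.tolist())

# A value equal to the lowest edge falls outside every bin unless include_lowest=True
edge_default = pd.cut(pd.Series([0, 10]), bins=[0, 50000], labels=['Low'])
edge_included = pd.cut(pd.Series([0, 10]), bins=[0, 50000], labels=['Low'], include_lowest=True)
print(edge_default.tolist(), edge_included.tolist())
```

So a salary of exactly 50000 is 'Low', 50001 is 'Medium', and a salary of 0 would be left unlabelled (NaN) without include_lowest=True.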
3.4 Grouping, Aggregation & Pivot Tables
Groupby – most powerful feature
Python
# Group by city and calculate mean salary
df.groupby('city')['salary'].mean()

# Multiple aggregations
df.groupby('city').agg({
    'salary': ['mean', 'max', 'count'],
    'age': 'median'
})
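The dictionary form of agg returns two-level column names. Named aggregation is an alternative that produces flat, self-describing columns; the sketch below uses invented data and hypothetical output names such as mean_salary.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna', 'Patna'],
    'salary': [60000, 80000, 40000, 50000, 90000],
    'age': [30, 40, 25, 35, 45],
})

# Each keyword is an output column: name=(source column, aggregation)
summary = df.groupby('city').agg(
    mean_salary=('salary', 'mean'),
    max_salary=('salary', 'max'),
    n=('salary', 'count'),
    median_age=('age', 'median'),
)
print(summary)
```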
Pivot Tables
Python
pd.pivot_table(
    df,
    values='salary',
    index='city',
    columns='department',
    aggfunc='mean',
    fill_value=0
)
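A small worked case shows how the pivot table arranges its output and where fill_value applies. The data below is invented; note that Patna has no HR rows.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Delhi'],
    'department': ['HR', 'IT', 'IT', 'IT'],
    'salary': [40000, 60000, 50000, 80000],
})

# Rows are cities, columns are departments, cells are mean salary
table = pd.pivot_table(
    df,
    values='salary',
    index='city',
    columns='department',
    aggfunc='mean',
    fill_value=0,  # used where a city/department combination has no rows
)
print(table)
```

The Delhi/IT cell is the mean of 60000 and 80000, and the empty Patna/HR cell is filled with 0 rather than NaN.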
Crosstab
Python
pd.crosstab(df['city'], df['gender'], margins=True)
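A crosstab counts how often each combination of two categorical columns occurs. The sketch below, on invented data, shows the counts with margins and, as an optional variation, row-wise proportions via normalize='index'.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna', 'Patna'],
    'gender': ['F', 'M', 'F', 'F', 'M'],
})

# Counts per city/gender pair; margins=True adds an 'All' row and column
counts = pd.crosstab(df['city'], df['gender'], margins=True)
print(counts)

# Share of each gender within a city (each row sums to 1)
shares = pd.crosstab(df['city'], df['gender'], normalize='index')
print(shares)
```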
3.5 Handling Missing Values & Outliers
Missing values
Python
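# Check missing values per column
df.isnull().sum()

# Fill missing values (assign the result back; fillna(..., inplace=True) on a
# single column triggers a warning in recent pandas and may not update df)
df['age'] = df['age'].fillna(df['age'].median())
df['city'] = df['city'].fillna('Unknown')

# Drop rows where salary is missing
df = df.dropna(subset=['salary'])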
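The same check, fill, and drop sequence can be traced on a four-row example. The data is invented, and the sketch uses a dictionary with DataFrame.fillna to fill several columns in one call, returning new objects instead of modifying in place.

```python
import pandas as pd

# Hypothetical data with gaps in every column
df = pd.DataFrame({
    'age': [22, None, 30, None],
    'city': ['Delhi', None, 'Patna', 'Delhi'],
    'salary': [50000, 60000, None, 70000],
})

missing = df.isnull().sum()  # number of missing values per column
print(missing)

# Fill age with its median (26) and city with a placeholder
filled = df.fillna({'age': df['age'].median(), 'city': 'Unknown'})

# Salary is left unfilled; rows without it are dropped
filled = filled.dropna(subset=['salary'])
print(filled)
```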
Detect & handle outliers
Python
# Using the IQR method
Q1 = df['salary'].quantile(0.25)
Q3 = df['salary'].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Remove outliers
df_clean = df[(df['salary'] >= lower) & (df['salary'] <= upper)]

# Or cap them
df['salary'] = df['salary'].clip(lower=lower, upper=upper)
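Working the IQR rule through by hand on a few numbers makes the bounds concrete. The sketch below uses six invented salaries, one of which is clearly extreme, and relies on the default linear interpolation of quantile.

```python
import pandas as pd

# Hypothetical salaries with one extreme value
df = pd.DataFrame({'salary': [40000, 45000, 50000, 55000, 60000, 500000]})

q1 = df['salary'].quantile(0.25)  # 46250.0
q3 = df['salary'].quantile(0.75)  # 58750.0
iqr = q3 - q1                     # 12500.0
lower = q1 - 1.5 * iqr            # 27500.0
upper = q3 + 1.5 * iqr            # 77500.0

# Option 1: drop rows outside the bounds
df_clean = df[(df['salary'] >= lower) & (df['salary'] <= upper)]

# Option 2: keep every row but cap values at the bounds
capped = df['salary'].clip(lower=lower, upper=upper)

print(lower, upper)
print(len(df_clean), capped.max())
```

Removing outliers drops the 500000 row, while capping keeps it and replaces the value with 77500.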
Mini Summary Project – Quick EDA on Sample Dataset
Python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load sample (or your own CSV)
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")

# Quick look
print(df.head())
df.info()  # prints directly and returns None, so no print() needed
print(df.describe())

# Missing values
print(df.isnull().sum())

# Group analysis
print(df.groupby('day')['total_bill'].mean())

# Visualization
sns.boxplot(x='day', y='total_bill', data=df)
plt.title("Total Bill by Day")
plt.show()
This completes the Pandas – Data Manipulation & Analysis section, which covers the core tool for day-to-day data science work in Python.
Email-ibm.anshuman@gmail.com
© 2026 CodeForge AI