Experiment 3: Python Data Analysis (PANDAS)

Course: ECE 2112 - Advanced Computer Programming and Algorithms

Institution: University of Santo Tomas, Faculty of Engineering, Electronics Engineering Department

Project Overview

This experiment focuses on understanding and applying various DataFrame manipulation techniques using the Pandas library in Python. The goal is to implement and solve the following problems related to slicing, indexing, and subsetting a DataFrame:

  1. Extract odd-numbered rows and columns from a given DataFrame.
  2. Retrieve specific rows and columns using indexing.
  3. Query and manipulate the data based on specific conditions.

Table of Contents

  1. Intended Learning Outcomes
  2. Problem Descriptions
  3. Installation Instructions
  4. Usage
  5. Files Included
  6. Technologies Used
  7. License
  8. Author

Intended Learning Outcomes

  1. Understand the basics of DataFrame indexing and slicing in Pandas.
  2. Apply various DataFrame operations such as selecting specific rows, filtering data, and applying conditions.
  3. Retrieve data based on column and row conditions using subsetting and slicing techniques.

Problem Descriptions

Problem 1

  • Objective: Load the corresponding .csv file into a data frame named cars using pandas.
    • Task:
      • Use pd.read_csv to load the cars.csv
    • Code sample:
import pandas as pd
cars = pd.read_csv('cars.csv')
  • Objective: Display the first five and last five rows of the resulting cars.
    • Task:
      • Use pd.concat to concatenate the first five and last five rows: cars.head(), cars.tail()
    • Code sample:
print(pd.concat([cars.head(), cars.tail()]))
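The head/tail preview can be tried without the dataset. The following is a minimal sketch that assumes a made-up 12-row DataFrame in place of cars.csv; the column values are hypothetical and are not taken from the repository's file.

```python
import pandas as pd

# Hypothetical stand-in for cars.csv: 12 made-up rows, so that
# head() and tail() do not overlap.
cars = pd.DataFrame({
    "Model": [f"car_{i:02d}" for i in range(12)],
    "cyl": [4, 6, 8] * 4,
})

# head() and tail() both default to n=5, so the result has 10 rows.
preview = pd.concat([cars.head(), cars.tail()])
print(preview)
```

The original row labels are kept in the result, so the gap between the first and last five rows stays visible in the index.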

Problem 2

  • Objective: Display the first five rows of the odd-numbered columns (1, 3, 5, 7...) of the given DataFrame.
    • Task:
      • Use the Pandas iloc indexer to select the odd-numbered columns.
      • Limit the output to the first five rows.
    • Code sample:
print(cars.iloc[:5, ::2])
  • Objective: Extract the row that contains the 'Model' of 'Mazda RX4'.
    • Task:
      • Use the Pandas loc indexer to retrieve the row containing the car model 'Mazda RX4'.
    • Code sample:
print(cars.loc[cars['Model'] == 'Mazda RX4'])
  • Objective: Determine how many cylinders ('cyl') the car model 'Camaro Z28' has.
    • Task:
      • Filter the DataFrame to find the row for 'Camaro Z28' and retrieve the number of cylinders.
    • Code sample:
print(cars.loc[cars['Model'] == 'Camaro Z28', 'cyl'])
  • Objective: Retrieve the number of cylinders ('cyl') and gear type ('gear') for the car models: 'Mazda RX4 Wag', 'Ford Pantera L', and 'Honda Civic'.
    • Task:
      • Use the Pandas isin() method to filter for these models and display the required columns.
    • Code sample:
models = ['Mazda RX4 Wag', 'Ford Pantera L', 'Honda Civic']
selected_cars = cars[cars['Model'].isin(models)][['Model', 'cyl', 'gear']]
print(selected_cars)
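The four selections in Problem 2 can be exercised together on a small in-memory table. This is a sketch under stated assumptions: the five rows below are a hand-made stand-in modelled on the classic mtcars data, and the column order (Model, mpg, cyl, disp, hp, gear) is assumed rather than read from the repository's cars.csv.

```python
import pandas as pd

# Hypothetical stand-in for cars.csv (assumed columns and order).
cars = pd.DataFrame({
    "Model": ["Mazda RX4", "Mazda RX4 Wag", "Honda Civic",
              "Camaro Z28", "Ford Pantera L"],
    "mpg": [21.0, 21.0, 30.4, 13.3, 15.8],
    "cyl": [6, 6, 4, 8, 8],
    "disp": [160.0, 160.0, 75.7, 350.0, 351.0],
    "hp": [110, 110, 52, 245, 264],
    "gear": [4, 4, 4, 3, 5],
})

# iloc is positional: rows 0-4 and every second column starting at
# position 0, i.e. the 1st, 3rd, 5th columns when counting from one.
odd_cols = cars.iloc[:5, ::2]

# loc with a boolean mask keeps only the rows where the mask is True.
rx4 = cars.loc[cars["Model"] == "Mazda RX4"]

# A row mask plus a column label gives a Series; .iloc[0] extracts
# the single value from it.
camaro_cyl = cars.loc[cars["Model"] == "Camaro Z28", "cyl"].iloc[0]

# isin() tests membership in a list, so several models match at once.
models = ["Mazda RX4 Wag", "Ford Pantera L", "Honda Civic"]
selected = cars.loc[cars["Model"].isin(models), ["Model", "cyl", "gear"]]

print(odd_cols)
print(rx4)
print(camaro_cyl)
print(selected)
```

Passing the row mask and the column labels to loc in a single call avoids chained indexing such as cars[mask]['cyl'], which works for reading but can trigger warnings when used for assignment.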

Installation Instructions

To run the provided Python code, ensure you have the following installed:

  1. Python (Version 3.6 or higher)
  2. Jupyter Notebook
  3. Pandas library (NumPy is installed automatically as a dependency)

Installation steps:

  1. Clone the repository:
    git clone https://github.com/draemonsi/data-analysis-with-python.git
  2. Install dependencies (if Pandas is not installed):
    pip install pandas
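To confirm that the installation worked, a quick check like the following can be run in the same environment. It is only a sketch: it prints the interpreter and pandas versions and checks the Python version against the 3.6 minimum stated above.

```python
import sys

import pandas as pd

# Report which interpreter and pandas build are active.
python_version = sys.version.split()[0]
pandas_version = pd.__version__
print("Python:", python_version)
print("pandas:", pandas_version)

# The README asks for Python 3.6 or higher.
meets_minimum = sys.version_info >= (3, 6)
print("Python >= 3.6:", meets_minimum)
```

If the import fails with ModuleNotFoundError, pandas is not installed in the environment that is running the script.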

Usage

  1. Open the Python environment (Jupyter Notebook, VS Code, etc.).
  2. Load the provided dataset into a Pandas DataFrame.
  3. Run the code snippets for each problem sequentially.
  4. Upon running the code, you'll see the output directly in your environment.

Files Included

  • Simon_Pandas-P1.py: Python file containing the solution to Problem 1
  • Simon_Pandas-P2.py: Python file containing the solution to Problem 2
  • cars.csv: CSV file containing the car data used in both problems

Technologies Used

  • Python (version 3.x)
  • Pandas (Python Data Analysis Library)
  • Jupyter Notebook for code execution and analysis

License

This project is licensed under The Unlicense. See the LICENSE file for details.


Author

Andrei Jorelle C. Simon
GitHub Profile
