Skip06/webScrapper: A high-performance, asynchronous web scraper built with Python around a modular "Fetcher-Parser-Storage" architecture. It focuses on systems efficiency, using HTTPX for concurrency and Selectolax for low-latency HTML parsing.

Architecture

Fetcher: Asynchronous HTTP/2 networking using HTTPX for connection pooling.

Parser: Low-latency HTML extraction using the C-based Selectolax engine.

Models: Strict data validation and schema enforcement via Pydantic.

Package manager: uv (Rust-based package resolver)

Automation: Makefile (Task orchestration)

Logging: Loguru (Structured system logs)
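
The Fetcher-Parser-Storage split described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it uses only standard-library stand-ins so it runs without dependencies (a fake coroutine in place of an HTTPX client, `html.parser` in place of Selectolax, a dataclass in place of a Pydantic model, and a plain list as storage). The names `Page`, `parse`, `crawl`, and `fake_fetch` are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Awaitable, Callable, Iterable, List


@dataclass(frozen=True)
class Page:
    """Hypothetical record type; the project uses Pydantic models for this role."""
    url: str
    title: str

    def __post_init__(self) -> None:
        # Minimal validation standing in for schema enforcement.
        if not self.title:
            raise ValueError(f"empty title for {self.url}")


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.title:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse(url: str, html: str) -> Page:
    """Parser stage: turn raw HTML into a validated record."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return Page(url=url, title=parser.title.strip())


# The fetcher is injected, so a real HTTP client can replace the fake one.
Fetch = Callable[[str], Awaitable[str]]


async def crawl(urls: Iterable[str], fetch: Fetch, store: List[Page],
                max_concurrency: int = 5) -> None:
    """Run fetch -> parse -> store for every URL with bounded concurrency."""
    limit = asyncio.Semaphore(max_concurrency)

    async def worker(url: str) -> None:
        async with limit:              # Fetcher stage, at most N in flight
            html = await fetch(url)
        store.append(parse(url, html))  # Parser stage, then Storage stage

    await asyncio.gather(*(worker(u) for u in urls))


async def fake_fetch(url: str) -> str:
    """Stand-in for a network call; returns canned HTML."""
    await asyncio.sleep(0)
    return f"<html><head><title>Page for {url}</title></head></html>"


if __name__ == "__main__":
    saved: List[Page] = []
    asyncio.run(crawl(["https://example.com/a", "https://example.com/b"],
                      fake_fetch, saved))
    for page in saved:
        print(page)
```

Because `crawl` only depends on the `Fetch` signature, each stage can be swapped or tested on its own, which is the main benefit a modular layout like this is usually meant to give.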

1. Installation

git clone https://github.com/Skip06/kernel-scraper.git
cd kernel-scraper
uv sync

2. Execution

make run    # Execute the scraper
make clean  # Purge __pycache__ and local data
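
For readers without `make`, the `__pycache__` part of the clean step could be approximated in Python along these lines. This is a sketch under assumptions: the actual Makefile recipe is not shown here, `purge_pycache` is a hypothetical helper, and it deliberately leaves "local data" alone because its location is not specified.

```python
import shutil
from pathlib import Path
from typing import List


def purge_pycache(root: Path) -> List[Path]:
    """Delete every __pycache__ directory under root and return the removed paths."""
    removed: List[Path] = []
    # Collect and sort matches first so deletion does not disturb the walk.
    for cache_dir in sorted(root.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed


if __name__ == "__main__":
    for path in purge_pycache(Path(".")):
        print(f"removed {path}")
```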

Folders and files

core
models
scrapper
.gitignore
.python-version
Makefile
README.md
main.py
pyproject.toml
uv.lock
