
rss-scraping

public
9 stars
0 forks
1 issues

Commits

List of commits on branch main.
Verified
a2c5a9ea06eaef7946b6cfdb2e7d175f7dd0d96e

Update README.md

llwindolf committed 6 months ago
Verified
2ced8f12e66b896d2ee0d53e08021d2ca34dcb08

Update README.md

llwindolf committed a year ago
Verified
fa20e5f177905e2b505124a3718cb91f20198943

Create mangakatana-chapters.xsl

llwindolf committed a year ago
Verified
d139945353abcfdcd8dd07b42e6a83433cfde222

Fix typo

llwindolf committed a year ago
Verified
af78306dce62d8ae5475be40105f1893bc37e40b

Update README.md

llwindolf committed 2 years ago
Verified
ce3b76e8362a800efc7fab18b33489968e553f63

Update README.md

llwindolf committed 2 years ago

README

The README file for this repository.

RSS website scraping scripts

This repository is a guide to RSS/Atom feed scraping solutions. While this repo hosts a few scripts and examples to build upon, it mostly provides links to existing solutions.

This repo focuses on simple, (almost) zero-setup solutions!

Example scraping scripts

The examples folder provides a few scripts that illustrate how to write a scraper in different scripting languages. It focuses on languages that run out of the box on all Linux distributions, so you can use those scrapers with feed readers that support running scripts as sources, like Liferea and SnowNews.
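To give a feel for what such a scraper can look like, here is a minimal sketch in Python that uses only the standard library. It is not one of the scripts from the examples folder. The names `LinkCollector` and `html_to_rss` are made up for illustration, and the sketch assumes the feed reader picks up the feed from the script's standard output. Every link on the page becomes one feed item, which is rarely what you want in practice, so treat it as a starting point.

```python
#!/usr/bin/env python3
"""Minimal scraper sketch: turn the links of an HTML page into an RSS 2.0 feed."""
import sys
import urllib.request
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Fall back to the link target when the anchor has no text.
            title = "".join(self._text).strip() or self._href
            self.links.append((self._href, title))
            self._href = None


def html_to_rss(html, feed_title, feed_link):
    """Return an RSS 2.0 document (as a string) with one item per link."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    items = "".join(
        f"<item><title>{escape(title)}</title><link>{escape(href)}</link></item>"
        for href, title in parser.links
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<rss version="2.0"><channel>'
        f"<title>{escape(feed_title)}</title>"
        f"<link>{escape(feed_link)}</link>"
        f"<description>{escape(feed_title)}</description>"
        f"{items}</channel></rss>"
    )


# Only fetch when called with exactly one http(s) URL as argument.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1].startswith("http"):
    url = sys.argv[1]
    with urllib.request.urlopen(url) as response:
        page = response.read().decode("utf-8", errors="replace")
    print(html_to_rss(page, feed_title=url, feed_link=url))
```

Saved under a hypothetical name such as `links2rss.py`, it could be registered in a feed reader as a command source along the lines of `python3 links2rss.py https://example.org/news`.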

OSS scraping scripts

This is a list of simple scripts you can run locally. You can use them with any desktop feed reader that can run local commands.

Tool | Input | Extraction | Output | Details
sjehuda/html2atom | HTML | XPath | Atom | Python script
h43z/rssify | HTML | CSS selectors | RSS | Python script
Tweeper | Twitter | auto | RSS | PHP script
MixCloud | HTML | auto | RSS | PHP script
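As an illustration of the "XPath in, Atom out" approach listed above, the following sketch selects links with an XPath expression and wraps them in an Atom feed. It does not reproduce the interface of html2atom or any other listed tool. The function name `xpath_to_atom` and its parameters are assumptions made for this example. It relies on `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed (XHTML-like) input, and the generated feed is stripped down (for instance, it has no author element).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def xpath_to_atom(xhtml, item_path, feed_id, feed_title):
    """Build a bare-bones Atom feed from the <a> elements matched by item_path."""
    page = ET.fromstring(xhtml)
    # Serialize Atom elements with a default namespace instead of a prefix.
    ET.register_namespace("", ATOM_NS)
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = feed_id
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = feed_title
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = now

    for node in page.findall(item_path):
        href = node.get("href")
        if not href:
            continue  # skip matches that carry no link target
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = href
        title = "".join(node.itertext()).strip() or href
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
        # The page gives no dates, so the scrape time is used as a stand-in.
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = now
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=href)

    return ET.tostring(feed, encoding="unicode")
```

A call such as `xpath_to_atom(page, ".//div[@class='post']/a", "https://example.org/", "Demo")` would keep only links inside `div` elements with class `post`. For real-world HTML, which is often not well-formed, a more tolerant parser is needed before an approach like this works.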

Commercial scraping solutions

These are third-party services, usually provided by companies that offer subscriptions. The list is roughly ordered by usefulness and simplicity of the services. When using free plans, consider your privacy!

Tool | Input | Extraction | Output | Sign Up | Details
rsshub.app | Many social networks | auto | RSS | no | Simple link syntax, e.g. https://rsshub.app/<service>/user/<user name>
nitter.net | Twitter | auto | RSS | no | Simple link syntax: https://nitter.net/<twitter username>/rss
feed43.com | Any website | string pattern | RSS | no | Free for non-commercial use. Lets you specify patterns to extract
fivefilters.org | Any website | CSS selectors | RSS | no | Returns only the 5 most recent items per feed
RSS.app | Many social networks | auto | RSS | yes |
fetchrss | Any website | visual assistant | RSS | yes | 4 feeds are free
Google Search | Google Search API | Query | RSS/Atom | yes | 100 requests per day, API key necessary
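The "simple link syntax" entries above can be filled in mechanically. The sketch below builds feed URLs from the two patterns given in the table. The helper names are made up for this example, and whether a service still answers at these addresses is not checked here.

```python
from urllib.parse import quote


def rsshub_feed_url(service, user_name):
    """Fill in https://rsshub.app/<service>/user/<user name>."""
    return f"https://rsshub.app/{quote(service, safe='')}/user/{quote(user_name, safe='')}"


def nitter_feed_url(twitter_username):
    """Fill in https://nitter.net/<twitter username>/rss (a leading @ is dropped)."""
    name = twitter_username.lstrip("@")
    return f"https://nitter.net/{quote(name, safe='')}/rss"
```

The resulting URL can be added to a feed reader like any ordinary feed subscription.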

If you find a service or link that is broken or missing, please create a PR!