Built in the open

Proxidize maintains free, open-source scraping tools for the developer community. Use them standalone or pair them with Proxidize proxies for reliable, unblocked data collection at scale.

github.com/proxidize

Reddit Scraper

Python · MIT

Scrape any subreddit, user, or post — no API key needed.

A production-ready Python scraper for Reddit built around the public JSON endpoints. Supports proxy rotation, CAPTCHA solving, rate limiting, and full comment thread extraction out of the box.

  • JSON endpoint scraping — no Reddit API key required
  • Automatic proxy rotation with health monitoring
  • CAPTCHA solving via Capsolver integration
  • Full comment thread extraction with nested replies
  • JSON and CSV export with standardised field names
  • Docker support and rich CLI with progress bars
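
To illustrate what "full comment thread extraction with nested replies" involves, here is a minimal sketch that flattens a Reddit-style comment listing into rows with a depth column. It assumes the shape Reddit's public JSON responses use (a `Listing` whose children have `kind: "t1"` for comments, with `replies` either an empty string or another `Listing`). The function name `iter_comments`, the row fields, and the sample data are hypothetical and are not taken from the Proxidize scraper.

```python
from typing import Any, Iterator


def iter_comments(listing: Any, depth: int = 0) -> Iterator[dict]:
    """Yield one flat row per comment in a Reddit-style Listing, depth-first."""
    # A comment with no replies carries "" instead of a Listing, so bail out
    # on anything that is not a dict.
    if not isinstance(listing, dict):
        return
    for child in listing.get("data", {}).get("children", []):
        # Skip "more" placeholders and any other non-comment kinds.
        if child.get("kind") != "t1":
            continue
        data = child["data"]
        yield {
            "id": data.get("id"),
            "author": data.get("author"),
            "body": data.get("body"),
            "depth": depth,
        }
        # Recurse into nested replies, one level deeper.
        yield from iter_comments(data.get("replies"), depth + 1)


# Hypothetical sample data in the assumed shape (no network access needed).
sample = {
    "kind": "Listing",
    "data": {"children": [
        {"kind": "t1", "data": {
            "id": "a", "author": "alice", "body": "Top-level comment",
            "replies": {"kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {
                    "id": "b", "author": "bob", "body": "Nested reply",
                    "replies": "",
                }},
                {"kind": "more", "data": {"children": ["c"]}},
            ]}},
        }},
    ]},
}

rows = list(iter_comments(sample))
for row in rows:
    print("  " * row["depth"] + f'{row["author"]}: {row["body"]}')
```

Flat rows with an explicit `depth` field map directly onto CSV columns, which is one plausible way to support both JSON and CSV export from the same data.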

X / Twitter Scraper

Python · MIT

Scrape timelines, keywords, and date ranges — no official API needed.

A Python scraper for X (Twitter) that uses Playwright for browser automation and OpenAI for tweet analysis. Collect tweets, engagement metrics, and media without the official API.

  • Timeline, keyword, and historical date-range scraping
  • AI sentiment analysis, topic extraction, and summaries via OpenAI
  • Checkpoint and resume for interrupted scrapes
  • Proxy support for IP rotation with residential and mobile proxies
  • Cookie-based session persistence for reliable long-term scraping
  • Rich metadata: engagement metrics, media, hashtags, and URLs
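
The checkpoint-and-resume idea can be sketched with nothing but the standard library. The example below is an assumption about how such a feature might work, not the scraper's actual code: it stores the set of already-processed item IDs in a JSON file and skips them on the next run. The names `load_checkpoint`, `save_checkpoint`, and `scrape`, and the `limit` parameter used to simulate an interruption, are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional, Set


def load_checkpoint(path: Path) -> Set[str]:
    """Return the IDs recorded by a previous run, or an empty set."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_checkpoint(path: Path, seen: Set[str]) -> None:
    """Write to a temp file first, then swap it in, so a crash mid-write
    cannot leave a half-written checkpoint behind."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(sorted(seen)), encoding="utf-8")
    os.replace(tmp, path)


def scrape(ids: Iterable[str], path: Path, limit: Optional[int] = None) -> List[str]:
    """Process IDs not yet in the checkpoint; `limit` simulates an interruption."""
    seen = load_checkpoint(path)
    processed: List[str] = []
    for item_id in ids:
        if item_id in seen:
            continue  # already handled in an earlier run
        if limit is not None and len(processed) >= limit:
            break
        processed.append(item_id)  # real fetching/parsing would happen here
        seen.add(item_id)
        save_checkpoint(path, seen)
    return processed


with tempfile.TemporaryDirectory() as workdir:
    checkpoint = Path(workdir) / "checkpoint.json"
    first_run = scrape(["1", "2", "3", "4"], checkpoint, limit=2)  # "interrupted"
    second_run = scrape(["1", "2", "3", "4"], checkpoint)          # resumes
    print(first_run, second_run)
```

Saving after every item keeps the sketch simple; a real scraper would more likely batch checkpoint writes to reduce disk I/O.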

Our philosophy

Why we build open source

Proxidize builds open-source scraping tools to help developers test, prototype, and understand real-world data collection workflows before scaling them.

Our tools are built around practical scraping problems like rate limits, IP blocks, retries, session handling, CAPTCHA challenges, and structured data exports.
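
As one illustration of the retry problem mentioned above, the following is a minimal sketch of retrying a flaky call with exponential backoff and jitter. It is a generic pattern, not code from the Proxidize tools. The helper name `retry`, its parameters, and the `flaky` stand-in function are hypothetical.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn until it succeeds or `attempts` calls have failed.

    The wait doubles after each failure (0.5s, 1s, 2s, ...) with random
    jitter of up to 50% added, so parallel workers do not retry in lockstep.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:  # a real scraper should catch narrower error types
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Stand-in for a request that is blocked twice before succeeding.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated block")
    return "ok"


# Inject a fake sleep so the demo records delays instead of waiting.
delays: List[float] = []
result = retry(flaky, sleep=delays.append)
print(result, calls["n"], [round(d, 2) for d in delays])
```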

You can use these projects for free, modify them for your own workflow, or use them as a starting point for larger scraping and automation systems. When your workload grows, Proxidize helps you scale those workflows with managed mobile and residential proxies.

Scale your workflow

Scale open-source scraping with Proxidize

Most scraping projects start simple. But as request volume grows, developers often run into rate limits, IP blocks, CAPTCHA challenges, and unstable sessions.

Proxidize helps you scale scraping workflows with managed mobile and residential proxies, rotating and sticky sessions, city and carrier targeting, and support for HTTP(S), SOCKS5, and UDP over SOCKS.
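
To make rotation concrete, here is a small sketch of a round-robin proxy pool that skips proxies after repeated failures. It is an illustrative assumption, not Proxidize's API or the open-source tools' implementation. The `ProxyPool` class, the `max_failures` threshold, and the `.example` proxy URLs are hypothetical placeholders. The `{"http": ..., "https": ...}` mapping is the shape the `requests` library accepts for its `proxies` argument; `socks5://` URLs there additionally need the `requests[socks]` extra installed.

```python
from itertools import cycle
from typing import Dict, Iterable


class ProxyPool:
    """Round-robin rotation that skips proxies with too many recent failures."""

    def __init__(self, proxy_urls: Iterable[str], max_failures: int = 2) -> None:
        urls = list(proxy_urls)
        if not urls:
            raise ValueError("at least one proxy URL is required")
        self._failures: Dict[str, int] = {url: 0 for url in urls}
        self._max_failures = max_failures
        self._order = cycle(urls)
        self._size = len(urls)

    def next_proxy(self) -> str:
        # Look at each proxy at most once per call, so an all-unhealthy
        # pool raises instead of looping forever.
        for _ in range(self._size):
            url = next(self._order)
            if self._failures[url] < self._max_failures:
                return url
        raise RuntimeError("no healthy proxies left")

    def report_failure(self, url: str) -> None:
        self._failures[url] += 1

    def report_success(self, url: str) -> None:
        self._failures[url] = 0  # a success clears the failure count

    @staticmethod
    def as_requests_proxies(url: str) -> Dict[str, str]:
        # Mapping shape accepted by the `proxies` argument in `requests`.
        return {"http": url, "https": url}


# Placeholder endpoints and credentials; substitute real ones.
pool = ProxyPool([
    "http://user:pass@proxy-a.example:8000",
    "socks5://user:pass@proxy-b.example:1080",
])

first = pool.next_proxy()
pool.report_failure(first)
pool.report_failure(first)  # proxy-a has now reached max_failures

picks = [pool.next_proxy() for _ in range(3)]
print(first, picks)
```

Rotating per request suits stateless fetches. For a sticky session, a caller would instead hold on to one proxy URL for the lifetime of that session.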

Use the open-source tools to prototype your workflow, then connect Proxidize proxies when you need more reliable infrastructure for production.

Recognition

Trusted. Certified. Recognized.

SOC 2 Type II certified. ISO 27001 certified. Every audit passed, every standard met.

FAQ

Got questions?
We've got answers.

Common questions about Proxidize open-source scraping tools.

Yes. Proxidize open-source projects are free to use and available on GitHub. You can download them, modify them, and use them as a starting point for your own scraping or automation workflows.
No. The tools can be used without Proxidize proxies. However, using proxies can help when you need to handle rate limits, avoid IP blocks, manage sessions, or scale larger scraping jobs.
It depends on the use case. Mobile proxies are usually better for workflows that need trusted mobile carrier IPs, sticky sessions, or social media automation. Residential proxies are often better for broad web scraping, SEO monitoring, price tracking, and data collection across many locations.
Yes, but you should review and adapt the code for your own production requirements. For larger workloads, you may need proxy rotation, retries, logging, queue management, CAPTCHA handling, and compliance checks.
Yes. These projects are maintained by Proxidize as open-source tools for developers, scraping teams, and automation engineers.
Yes. Developers can open issues, suggest improvements, or submit pull requests through GitHub.
Yes. These tools are intended for lawful and ethical data collection. Always respect website terms, robots.txt rules, rate limits, privacy laws, and applicable regulations.