Scrapix is a highly customizable web scraper that allows users to create, manage, and execute complex web scraping workflows with a visual, no-code/low-code interface. Built with modern tools, Scrapix simplifies web scraping tasks and integrates advanced features like AI-based data extraction. Whether you're automating repetitive tasks or extracting structured data from websites, Scrapix has you covered.
- React: A JavaScript library for building user interfaces.
- Next.js: A React framework for server-side rendering and static site generation.
- TypeScript: A typed superset of JavaScript that compiles to plain JavaScript.
- Tailwind CSS: A utility-first CSS framework for styling.
- Radix UI: A set of accessible and unstyled components for building high-quality web interfaces.
- Framer Motion: A library for animations in React.
- Embla Carousel: A library for creating carousels in React.
- React Query: A library for fetching, caching, and updating asynchronous data in React.
- Recharts: A charting library for React.
- Sonner: A library for toast notifications in React.
- React Day Picker: A library for date picking in React.
- Node.js: A JavaScript runtime built on Chrome's V8 JavaScript engine.
- Prisma: An ORM (Object-Relational Mapping) tool for Node.js and TypeScript.
- PostgreSQL: A relational database management system.
- Puppeteer: A Node library for controlling headless Chrome or Chromium.
- Stripe: A library for handling payments and billing.
- OpenAI: A library for interacting with OpenAI's GPT models.
- Clerk: A library for authentication and user management.
- Class Variance Authority (CVA): A library for managing class names in a type-safe way.
- Clsx: A utility for constructing className strings conditionally.
- Tailwind Merge: A utility for merging Tailwind CSS classes and resolving conflicts (the sketch after this list shows how it pairs with Clsx and CVA).
- Cron Parser: A library for parsing cron expressions.
- Date-fns: A library for manipulating dates in JavaScript.
- Zod: A TypeScript-first schema declaration and validation library.
- ESLint: A tool for identifying and fixing problems in JavaScript code.
- PostCSS: A tool for transforming CSS with JavaScript plugins.
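The styling utilities above (Clsx, Tailwind Merge, and CVA) are commonly combined into a single helper. The following is a minimal sketch of that pattern, assuming a conventional `cn` helper and an illustrative button variant; it is not taken from the Scrapix source:

```ts
// Hypothetical sketch: combining Clsx, Tailwind Merge, and CVA.
// Names and class values are illustrative, not from the Scrapix codebase.
import { clsx, type ClassValue } from "clsx";
import { twMerge } from "tailwind-merge";
import { cva, type VariantProps } from "class-variance-authority";

// Merge conditional class names, then resolve conflicting Tailwind utilities.
export function cn(...inputs: ClassValue[]) {
  return twMerge(clsx(inputs));
}

// Describe a button's visual variants in a type-safe way.
export const buttonVariants = cva("rounded px-4 py-2 font-medium", {
  variants: {
    intent: {
      primary: "bg-blue-600 text-white",
      ghost: "bg-transparent text-blue-600",
    },
  },
  defaultVariants: { intent: "primary" },
});

export type ButtonVariants = VariantProps<typeof buttonVariants>;

// Usage: cn(buttonVariants({ intent: "ghost" }), isActive && "underline")
```

The usage comment shows the payoff: variants stay type-safe while conflicting Tailwind classes are resolved at merge time.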
- Launch Browser: Opens a browser instance to begin the web scraping process and interact with web pages.
- Navigate to URL: Directly navigate to a specified URL to scrape or interact with the content on the page.
- Page to HTML: Capture the complete HTML content of the page for detailed analysis and processing.
- Extract Text from Element: Easily extract text content from a specific HTML element using a CSS selector.
- Extract Data via AI: Leverage AI to intelligently parse HTML content and extract structured data based on a custom prompt, returning results as JSON.
- Read JSON: Retrieve and utilize specific keys or properties from a JSON object within your workflow.
- Build JSON: Add, modify, or create new data within a JSON object, enabling dynamic workflow customization.
- Fill Input: Automatically fill input fields with predefined values, simulating user input for forms or search bars.
- Click Element: Simulate click actions on specific HTML elements, enabling navigation or triggering events on the page.
- Scroll to Element: Simulate scrolling to a specified element, useful for pages with infinite scrolling or dynamic content loading.
- Wait for Element: Pause the workflow until a specified element is visible or hidden on the page, ensuring reliable scraping.
- Deliver via Webhook: Send the scraped data to an external API endpoint via a POST request, enabling seamless integration with other tools and services (see the sketch after this list).
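Taken together, these tasks map naturally onto browser-automation calls. The following is a minimal sketch built on Puppeteer and Node's global `fetch`, assuming placeholder URLs and selectors; it is not Scrapix's actual task executor:

```ts
// Hypothetical sketch of a few workflow tasks, not Scrapix's task executor.
// The URL, selectors, and webhook endpoint are placeholder assumptions.
import puppeteer from "puppeteer";

async function runExampleWorkflow(): Promise<void> {
  // Launch Browser
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  try {
    // Navigate to URL
    await page.goto("https://example.com", { waitUntil: "networkidle2" });

    // Wait for Element: pause until the heading is visible
    await page.waitForSelector("h1", { visible: true });

    // Extract Text from Element using a CSS selector
    const heading = await page.$eval("h1", (el) => el.textContent?.trim() ?? "");

    // Fill Input / Click Element: simulate user interaction when needed
    // await page.type("input[name='q']", "scrapix");
    // await page.click("button[type='submit']");

    // Page to HTML: capture the full document for later processing
    const html = await page.content();
    // "Extract Data via AI" would pass this HTML plus a prompt to an LLM
    // and parse the JSON it returns (omitted here).

    // Deliver via Webhook: POST the scraped data to an external endpoint
    await fetch("https://example.com/webhook", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ heading, htmlLength: html.length }),
    });
  } finally {
    await browser.close();
  }
}

runExampleWorkflow().catch(console.error);
```

In Scrapix itself these operations are composed visually as workflow tasks rather than written by hand; the sketch only illustrates the kind of calls each task performs.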
To run Scrapix locally:

- Clone the repository:

  ```bash
  git clone https://github.com/immortalsul/scrapix.git
  ```

- Install dependencies:

  ```bash
  cd scrapix
  npm install
  ```

- Set up your environment variables in `.env` (refer to the example in `.env.example`).

- Run the app locally:

  ```bash
  npm run dev
  ```
Visit http://scarpix.troikahub.tech to start using Scrapix.
We welcome contributions! If you'd like to improve Scrapix, feel free to fork the repository, create a branch, and submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.