ManyPI lets you convert any website into a reliable, structured API without writing complex scraping code. Simply describe what data you want, and our AI handles the rest.
Perfect for: E-commerce monitoring, lead generation, content aggregation, market research, and any scenario where you need structured data from websites.
Goal: Monitor competitor prices and product availability.
1. Describe what you want
2. Generated schema
3. API response
4. Use in your app
Prompt
Extract product information from this e-commerce page:
- Product title
- Current price in USD
- Original price if on sale
- Star rating (out of 5)
- Number of reviews
- Availability status (in stock or out of stock)
- Main product image URL
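Once the scraper exists, using it in your app could look like the sketch below. It reuses the scrape endpoint and Bearer-token header from the Node.js example on this page. The field names `currentPrice` and `availability`, and the helpers `scrapeProduct` and `detectChanges`, are hypothetical: check your generated schema for the real field names.

```javascript
// Hypothetical field names (currentPrice, availability): replace them with
// the names in the schema generated for your scraper.

// Fetch structured data for one product page.
// Uses the global fetch available in Node 18+; fetchImpl can be swapped in tests.
async function scrapeProduct(url, fetchImpl = fetch) {
  const response = await fetchImpl(
    'https://app.manypi.com/api/scrape/YOUR_SCRAPER_ID',
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.MANYPI_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ url })
    }
  );
  if (!response.ok) {
    throw new Error(`Scrape failed with HTTP ${response.status}`);
  }
  const { data } = await response.json();
  return data;
}

// Compare two snapshots of the same product and list what changed.
function detectChanges(previous, current) {
  const changes = [];
  if (previous.currentPrice !== current.currentPrice) {
    changes.push(`price: ${previous.currentPrice} -> ${current.currentPrice}`);
  }
  if (previous.availability !== current.availability) {
    changes.push(`availability: ${previous.availability} -> ${current.availability}`);
  }
  return changes;
}
```

A monitoring job would store the last snapshot for each product URL, call `scrapeProduct` on a schedule, and alert whenever `detectChanges` returns a non-empty list.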
Goal: Create a content aggregation feed from multiple news sources.
1. Describe what you want
2. Build a content feed
Prompt
Extract article information from news websites:
- Article headline
- Author name
- Publication date and time
- Article category/section
- Full article text
- Featured image URL
- Tags or keywords
- Estimated reading time
Node.js
const express = require('express');
const app = express();

// Your custom news API endpoint
app.get('/api/news/:category', async (req, res) => {
  const { category } = req.params;

  // URLs for different news sources
  const sources = [
    `https://newssite1.com/${category}`,
    `https://newssite2.com/${category}`,
    `https://newssite3.com/${category}`
  ];

  const articles = [];

  for (const url of sources) {
    const response = await fetch(
      'https://app.manypi.com/api/scrape/YOUR_SCRAPER_ID',
      {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${process.env.MANYPI_API_KEY}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ url })
      }
    );
    const { data } = await response.json();
    articles.push(data);
  }

  // Sort by publication date
  articles.sort((a, b) =>
    new Date(b.publicationDate) - new Date(a.publicationDate)
  );

  res.json({ articles });
});

app.listen(3000);
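The loop above requests the sources one at a time, and a single failed request rejects the whole handler. One possible variation, sketched here, fetches the sources concurrently with `Promise.allSettled` and keeps only the results that succeeded. The helper name `fetchArticles` and the injected `scrapeOne` function are hypothetical. `scrapeOne` stands in for the `fetch` call shown above.

```javascript
// Hypothetical helper: scrape every URL concurrently and skip failures.
// scrapeOne(url) is assumed to return a promise of one article object.
async function fetchArticles(urls, scrapeOne) {
  const results = await Promise.allSettled(urls.map((url) => scrapeOne(url)));

  const articles = [];
  results.forEach((result, index) => {
    if (result.status === 'fulfilled') {
      articles.push(result.value);
    } else {
      // Log the failure and continue with the remaining sources.
      console.warn(`Skipping ${urls[index]}: ${result.reason}`);
    }
  });

  // Newest first, matching the sort in the endpoint above.
  articles.sort(
    (a, b) => new Date(b.publicationDate) - new Date(a.publicationDate)
  );
  return articles;
}
```

With only three sources on different sites, concurrency is unlikely to stress any one server. For longer URL lists, a sequential loop with delays is the safer choice.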
Good: “Extract product title, price in USD format, star rating out of 5, and boolean availability status”
Bad: “Get product info”
Specific prompts lead to more accurate schemas and better extraction results.
Test with multiple pages
Test your scraper on 3-5 different pages from the same site to ensure consistency:
Different product types
Pages with missing data (out of stock, no reviews)
Pages with special formatting
This helps catch edge cases before production use.
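One way to make this check repeatable is to collect the result for each test page and report which expected fields came back empty. The sketch below assumes you already have the scraped objects in hand. The helper name `reportMissingFields`, the sample URLs, and the field names are all illustrative.

```javascript
// Hypothetical helper: report which expected fields are missing (undefined)
// or null on each test page, so gaps are visible before production use.
function reportMissingFields(resultsByUrl, expectedFields) {
  const report = {};
  for (const [url, data] of Object.entries(resultsByUrl)) {
    const missing = expectedFields.filter(
      (field) => data[field] === undefined || data[field] === null
    );
    if (missing.length > 0) {
      report[url] = missing;
    }
  }
  return report;
}

// Illustrative results from two test pages of the same site.
const sample = {
  'https://example.com/product/1': { title: 'Desk lamp', price: 24.5, reviewCount: 12 },
  'https://example.com/product/2': { title: 'Floor lamp', price: 80, reviewCount: null }
};

// Reports reviewCount as missing for the second page only.
console.log(reportMissingFields(sample, ['title', 'price', 'reviewCount']));
```

A field that is empty on every page usually points to a prompt or schema problem. A field that is empty on only some pages is a candidate for a nullable field.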
Handle missing data gracefully
Not every page has every field. Make optional fields nullable in your schema.
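As an illustration, a schema with nullable fields might look like the sketch below. It assumes a JSON Schema-style definition in which a type list that includes `'null'` marks a field as nullable. The format ManyPI actually generates may differ, and the field names and the `formatPrice` helper are hypothetical.

```javascript
// Assumption: a JSON Schema-style definition. The format ManyPI actually
// generates may differ, and these field names are illustrative.
const productSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    currentPrice: { type: 'number' },
    originalPrice: { type: ['number', 'null'] }, // only set for items on sale
    rating: { type: ['number', 'null'] },        // null when there are no reviews
    reviewCount: { type: ['integer', 'null'] }
  },
  required: ['title', 'currentPrice']
};

// Consumer-side handling: treat null (or a missing key) as "not available".
function formatPrice(product) {
  // `!= null` is true only when the value is neither null nor undefined.
  const onSale = product.originalPrice != null;
  return onSale
    ? `$${product.currentPrice} (was $${product.originalPrice})`
    : `$${product.currentPrice}`;
}
```

Code that consumes the API should then check for `null` before using an optional field, as `formatPrice` does for the original price.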
Respect robots.txt and terms of service
Always check a website’s robots.txt file and terms of service before scraping. Some sites explicitly prohibit automated access.
Rate limiting
Be mindful of request frequency. Excessive requests can:
Get your IP blocked
Overload target servers
Consume credits quickly
Implement reasonable delays between requests (1-2 seconds minimum).
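A minimal sketch of such a delay is shown below. It assumes requests are made sequentially from a single process. The helper names `sleep` and `scrapeWithDelay` are hypothetical, and `scrapeOne` stands in for whatever function performs the actual scrape request.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: scrape URLs one at a time, pausing between requests.
// scrapeOne(url) is assumed to return a promise of the extracted data.
async function scrapeWithDelay(urls, scrapeOne, delayMs = 1500) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    results.push(await scrapeOne(urls[i]));
    // No need to wait after the final request.
    if (i < urls.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}
```

The default of 1500 ms sits inside the 1-2 second range suggested above. Raise it for smaller sites or for sites that publish a crawl delay.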
Dynamic content
ManyPI handles JavaScript-rendered content automatically. However, some sites use advanced anti-bot measures. Contact support if you encounter issues with specific sites.