Overview
ManyPi uses JSON Schema to ensure your scraped data is always structured, validated, and type-safe. Define your schema once, and get guaranteed data consistency across all scrapes.
Benefits:
- Catch data issues early with validation
- Generate TypeScript types automatically
- Ensure consistent data structure
- Document your API responses
JSON Schema basics
Every ManyPi scraper uses a JSON Schema to define the structure of extracted data.
Simple example
- ✅ title is always a string
- ✅ price is always a number
- ✅ inStock is always a boolean
- ✅ title and price are always present
- ✅ inStock is optional (not in required array)
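A schema that gives the guarantees listed above could look like the following sketch. The field names come from the list; the constant name `productSchema` and the helper `isOptional` are assumptions made for illustration, not taken from ManyPi.

```typescript
// Hypothetical schema matching the checklist above; names are illustrative.
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    price: { type: "number" },
    inStock: { type: "boolean" },
  },
  // title and price must be present; inStock is left out, so it is optional.
  required: ["title", "price"],
} as const;

// A field is optional when it is declared in properties but not in required.
function isOptional(field: keyof typeof productSchema.properties): boolean {
  return (productSchema.required as readonly string[]).indexOf(field) === -1;
}

console.log(isOptional("inStock")); // true
console.log(isOptional("title")); // false
```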
Supported data types
Primitive types
- String
- Number
- Integer
- Boolean
- Null
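Each primitive type above corresponds to a value of the JSON Schema `type` keyword. As a rough sketch of what each one accepts, the hypothetical helper `matchesPrimitive` below mirrors the kind of check a validator performs; it is an illustration and not part of ManyPi.

```typescript
type PrimitiveType = "string" | "number" | "integer" | "boolean" | "null";

// Hypothetical helper: does a parsed JSON value satisfy a primitive "type"?
function matchesPrimitive(type: PrimitiveType, value: unknown): boolean {
  switch (type) {
    case "string":
      return typeof value === "string";
    case "number":
      // JSON has no NaN or Infinity, so only finite numbers qualify.
      return typeof value === "number" && isFinite(value);
    case "integer":
      // An integer is a number with no fractional part.
      return typeof value === "number" && isFinite(value) && Math.floor(value) === value;
    case "boolean":
      return typeof value === "boolean";
    case "null":
      return value === null;
  }
}

console.log(matchesPrimitive("integer", 3)); // true
console.log(matchesPrimitive("integer", 3.5)); // false
console.log(matchesPrimitive("number", 3.5)); // true
```

The distinction between number and integer is the one most likely to matter for scraped data such as prices versus quantities.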
Complex types
- Array
- Object
- Array of Objects
- Enum
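The complex types above are built by combining primitives. The fragments below sketch one schema per entry; the field names (`tags`, `seller`, `reviews`, `condition`) and the sample data are invented for illustration.

```typescript
// Hypothetical schema fragments, one per complex type listed above.
const complexExamples = {
  // Array: every item must match the "items" schema.
  tags: { type: "array", items: { type: "string" } },
  // Object: named properties, each with its own schema.
  seller: {
    type: "object",
    properties: { name: { type: "string" }, rating: { type: "number" } },
    required: ["name"],
  },
  // Array of objects: "items" is itself an object schema.
  reviews: {
    type: "array",
    items: {
      type: "object",
      properties: { author: { type: "string" }, stars: { type: "integer" } },
      required: ["author", "stars"],
    },
  },
  // Enum: the value must be one of the listed literals.
  condition: { type: "string", enum: ["new", "used", "refurbished"] },
} as const;

// Sample data that would satisfy the fragments above.
const sample = {
  tags: ["audio", "wireless"],
  seller: { name: "Acme" },
  reviews: [{ author: "Sam", stars: 5 }],
  condition: "used",
};
```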
TypeScript integration
Generate TypeScript types from your JSON Schema for full type safety in your application.
Using json-schema-to-typescript
1. Install the package
2. Convert schema to TypeScript (generate-types.ts)
3. Generated TypeScript types (types/product.ts)
4. Use in your application (app.ts)
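For a schema with a required string `title`, a required number `price`, and an optional boolean `inStock`, the generated file would contain roughly the interface below, and application code can then rely on it. This is a sketch: the exact output of json-schema-to-typescript depends on its version and options, and `Product` and `formatProduct` are names assumed for illustration.

```typescript
// Roughly what a generated types/product.ts might export (assumed shape).
interface Product {
  title: string;
  price: number;
  inStock?: boolean;
}

// Hypothetical app code: the compiler now checks every field access.
function formatProduct(product: Product): string {
  // inStock is optional, so undefined has to be handled explicitly.
  const stock =
    product.inStock === undefined ? "unknown" : product.inStock ? "in stock" : "sold out";
  return `${product.title}: $${product.price.toFixed(2)} (${stock})`;
}

// JSON.parse returns any, so the annotation is a promise to the compiler,
// not a runtime check.
const product: Product = JSON.parse('{"title":"Desk lamp","price":24.5}');
console.log(formatProduct(product)); // Desk lamp: $24.50 (unknown)
```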
Real-world schemas
E-commerce product
Job listing
Article/Blog post
Validation in practice
Client-side validation
Use libraries like Ajv to validate responses:
Runtime type checking with Zod
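Both Ajv and Zod check at runtime that a response really has the shape your types claim. The dependency-free sketch below shows the same idea with a hand-written type guard. It is a stand-in for those libraries, not their API, and `isProduct` is a hypothetical name.

```typescript
interface Product {
  title: string;
  price: number;
  inStock?: boolean;
}

// Hand-written runtime check; a validator library would derive this from the schema.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["title"] === "string" &&
    typeof record["price"] === "number" &&
    (record["inStock"] === undefined || typeof record["inStock"] === "boolean")
  );
}

const good: unknown = JSON.parse('{"title":"Desk lamp","price":24.5}');
const bad: unknown = JSON.parse('{"title":"Desk lamp","price":"24.5"}');

console.log(isProduct(good)); // true
console.log(isProduct(bad)); // false
```

A guard like this has to be updated by hand whenever the schema changes, which is the main reason to prefer a library that reads the schema directly.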
Best practices
Make optional fields nullable
Not all pages have all data. Use nullable types for optional fields:
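In JSON Schema, a nullable field is commonly written by listing `null` as a second allowed type. The sketch below assumes a hypothetical `rating` field that some pages lack. Whether a missing value arrives as `null` or as an absent key is not stated here, so the code handles both.

```typescript
// Hypothetical schema: rating may be a number or null.
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    rating: { type: ["number", "null"] },
  },
  required: ["title"],
} as const;

interface Product {
  title: string;
  rating?: number | null;
}

// Treat null and a missing key the same way when reading the field.
function ratingLabel(product: Product): string {
  return product.rating == null ? "No rating" : product.rating.toFixed(1);
}

console.log(ratingLabel({ title: "Desk lamp", rating: 4.5 })); // 4.5
console.log(ratingLabel({ title: "Desk lamp", rating: null })); // No rating
```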
Use enums for fixed values
When a field has a limited set of possible values, use enums:
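As a sketch, assume a hypothetical `availability` field that can only take three values. Declaring the values once and reusing them for both the schema and the TypeScript union keeps the two in step.

```typescript
// Hypothetical fixed set of values for an availability field.
const AVAILABILITY = ["in_stock", "out_of_stock", "preorder"] as const;
type Availability = (typeof AVAILABILITY)[number];

// The schema reuses the same list, so schema and type cannot drift apart.
const availabilitySchema = { type: "string", enum: AVAILABILITY } as const;

// Narrow an arbitrary scraped string to the union type.
function isAvailability(value: string): value is Availability {
  return (AVAILABILITY as readonly string[]).indexOf(value) !== -1;
}

console.log(isAvailability("preorder")); // true
console.log(isAvailability("sold out")); // false
```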
Set reasonable constraints
Add validation rules to catch data issues:
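JSON Schema provides keywords such as `minimum`, `minLength`, and `maxLength` for this. The sketch below pairs a hypothetical schema with a small hand-written checker to show what those keywords reject. In practice a validator library enforces them for you, and the specific limits here are assumptions.

```typescript
// Hypothetical constraints; the limits are illustrative.
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string", minLength: 1, maxLength: 200 },
    price: { type: "number", minimum: 0 },
  },
  required: ["title", "price"],
} as const;

// Hand-written equivalent of the constraints above, returning readable errors.
function checkProduct(product: { title: string; price: number }): string[] {
  const { title, price } = productSchema.properties;
  const errors: string[] = [];
  if (product.title.length < title.minLength) errors.push("title is empty");
  if (product.title.length > title.maxLength) errors.push("title is too long");
  if (product.price < price.minimum) errors.push("price is negative");
  return errors;
}

// Reports both the empty title and the negative price.
console.log(checkProduct({ title: "", price: -1 }));
```

An empty title or a negative price usually means a selector matched the wrong element, so failing early on these is cheaper than debugging bad rows later.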
Document your schema
Add descriptions to help future developers:
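The `description` keyword is the standard place for this in JSON Schema. A sketch, with hypothetical field names and wording:

```typescript
// Hypothetical schema with a description on the object and on each field.
const productSchema = {
  type: "object",
  description: "A single product scraped from a listing page.",
  properties: {
    title: { type: "string", description: "Product name as shown on the page." },
    price: { type: "number", description: "Current price, excluding shipping." },
  },
  required: ["title", "price"],
} as const;

// Descriptions are plain data, so simple tooling can read them back.
function describeFields(): string[] {
  const props: Record<string, { description: string }> = productSchema.properties;
  return Object.keys(props).map((name) => `${name}: ${props[name].description}`);
}

console.log(describeFields().join("\n"));
```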
Generate types automatically
Don’t manually write types - generate them from your schema:
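A generator such as json-schema-to-typescript does this as a build step. As a dependency-free sketch of the same principle, TypeScript can also infer a type directly from a schema declared `as const`. The `FromSchema` helper below is hand-written for this example and only handles flat objects with primitive fields.

```typescript
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    price: { type: "number" },
    inStock: { type: "boolean" },
  },
  required: ["title", "price"],
} as const;

// Map a primitive schema fragment to the matching TypeScript type.
type FromPrimitive<S> = S extends { type: "string" }
  ? string
  : S extends { type: "number" | "integer" }
  ? number
  : S extends { type: "boolean" }
  ? boolean
  : unknown;

type ObjectSchema = {
  properties: Record<string, unknown>;
  required: readonly string[];
};

// Required keys stay mandatory; all other declared keys become optional.
type FromSchema<S extends ObjectSchema> = {
  [K in keyof S["properties"] as K extends S["required"][number] ? K : never]: FromPrimitive<S["properties"][K]>;
} & {
  [K in keyof S["properties"] as K extends S["required"][number] ? never : K]?: FromPrimitive<S["properties"][K]>;
};

// Product is derived from the schema, so editing the schema updates the type.
type Product = FromSchema<typeof productSchema>;

const lamp: Product = { title: "Desk lamp", price: 24.5 };
console.log(lamp.title); // Desk lamp
```

Because `Product` is computed from `productSchema`, removing `price` from `required` makes it optional in the type without any further edits.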
