Here's a structured approach to generating fake product data for an e-commerce platform, along with sample data and implementation guidance:
Key Fields for Product Data
- Product ID (Unique identifier)
- Name (Descriptive product title)
- Category (Product classification)
- Price (Cost in USD)
- Stock Quantity (Inventory count)
- Description (Brief feature summary)
- Rating (Customer review score)
- Reviews Count (Number of customer reviews)
- Brand (Manufacturer name)
- Availability (In stock/out of stock)
Sample Fake Product Data (5 Products)
| ID | Name | Category | Price | Stock | Description | Rating | Reviews | Brand | Availability |
|---|---|---|---|---|---|---|---|---|---|
| 101 | Wireless Headphones Pro | Electronics | 99 | 150 | Noise-cancelling, 30hr battery, Bluetooth 5.0 | 5 | 324 | AudioTech | In Stock |
| 102 | Organic Cotton T-Shirt | Clothing | 99 | 200 | 100% organic, fair-trade, machine washable | 2 | 178 | EcoWear | In Stock |
| 103 | Yoga Mat Premium | Sports | 50 | 75 | Non-slip, 6mm thick, eco-friendly material | 4.7 | 412 | FitLife | In Stock |
| 104 | Smart Home Hub | Electronics | 99 | 0 | Voice control, compatible with Alexa/Google Home | 4.8 | 89 | HomeSmart | Out of Stock |
| 105 | Stainless Steel Water Bottle | Home | 99 | 300 | Insulated, 24hr cold/hot, leak-proof lid | 4.9 | 567 | AquaPure | In Stock |
Data Generation Guidelines
- Realistic Constraints:
  - Price: $10–$500 (common e-commerce range)
  - Stock: 0–500 units (include 0 for "out of stock")
  - Rating: 0.0–5.0 (1 decimal place)
  - Reviews: 0–10,000 (higher for popular products)
- Category Distribution:
  categories = ["Electronics", "Clothing", "Sports", "Home", "Books", "Beauty"]
- Brand Names:
  Generate plausible names (e.g., "TechNova", "StyleCraft", "FitZone")
- Description Patterns:
  - Electronics: "High-performance, [feature], [benefit]."
  - Clothing: "Comfortable [material], [design], [care instructions]."
  - Home: "[Material] [item], [key feature], [size/dimensions]."
- Availability Logic:
  90% "In Stock", 10% "Out of Stock"
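The description patterns above can be sketched as per-category templates. This is a minimal illustration rather than a complete generator: `TEMPLATES` and `name_and_description` are hypothetical names, the slot values are invented placeholders, and only the three categories with a stated pattern are covered.

```python
import random

# Hypothetical template table. The pattern shapes follow the guidelines;
# the item names and slot values are made-up placeholders.
TEMPLATES = {
    "Electronics": {
        "items": ["Headphones", "Smart Hub", "Bluetooth Speaker"],
        "pattern": "High-performance, {feature}, {benefit}.",
        "slots": {
            "feature": ["noise-cancelling", "voice control", "Bluetooth 5.0"],
            "benefit": ["30hr battery", "easy setup", "compact design"],
        },
    },
    "Clothing": {
        "items": ["T-Shirt", "Hoodie", "Jacket"],
        "pattern": "Comfortable {material}, {design}, {care}.",
        "slots": {
            "material": ["organic cotton", "merino wool", "recycled polyester"],
            "design": ["slim fit", "relaxed fit", "crew neck"],
            "care": ["machine washable", "hand wash only", "tumble dry low"],
        },
    },
    "Home": {
        "items": ["Water Bottle", "Travel Mug", "Thermos"],
        "pattern": "{material} {item}, {feature}, {size}.",
        "slots": {
            "material": ["Stainless steel", "Bamboo", "Ceramic"],
            "feature": ["insulated", "leak-proof lid", "stackable"],
            "size": ["500 ml", "750 ml", "1 L"],
        },
    },
}


def name_and_description(category, rng=random):
    """Pick an item for the category and fill in its description pattern."""
    spec = TEMPLATES[category]
    item = rng.choice(spec["items"])
    # Choose one value per slot; patterns that do not use {item} ignore it.
    values = {slot: rng.choice(options) for slot, options in spec["slots"].items()}
    values["item"] = item.lower()
    return item, spec["pattern"].format(**values)
```

Because the item and the description come from the same category entry, the product name and category stay aligned by construction. Passing a seeded `random.Random` instance as `rng` makes the output reproducible.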
Python Implementation Snippet
import random


def generate_fake_product(product_id):
    """Build one fake product as a dict."""
    categories = ["Electronics", "Clothing", "Sports", "Home", "Books", "Beauty"]
    brands = ["TechNova", "StyleCraft", "FitZone", "HomeSmart", "PureLife"]
    adjectives = ["Wireless", "Organic", "Smart", "Eco-Friendly"]
    items = ["Headphones", "T-Shirt", "Yoga Mat", "Hub", "Bottle"]
    qualities = ["durable", "lightweight", "waterproof"]

    # Decide availability first (about 10% out of stock), then pick a stock
    # level that agrees with it so the two fields never contradict each other.
    in_stock = random.random() > 0.1
    stock = random.randint(1, 500) if in_stock else 0

    # Note: name and category are drawn independently, so they may not match.
    return {
        "id": product_id,
        "name": f"Premium {random.choice(adjectives)} {random.choice(items)}",
        "category": random.choice(categories),
        "price": round(random.uniform(10, 500), 2),
        "stock": stock,
        "description": f"High-quality {random.choice(qualities)} product with premium features.",
        "rating": round(random.uniform(0, 5), 1),
        "reviews": random.randint(0, 10000),
        "brand": random.choice(brands),
        "availability": "In Stock" if in_stock else "Out of Stock",
    }


# Generate 10 products with IDs 1 through 10.
products = [generate_fake_product(i) for i in range(1, 11)]
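The snippet draws ratings uniformly from 0 to 5, so only about one in five lands at 4.0 or above. If the data should simulate mostly positive reviews (for example, roughly 70% of ratings at 4.0 or higher), one possible approach is a two-band draw. The function name `skewed_rating` and the `high_share` parameter are assumptions made for this sketch.

```python
import random


def skewed_rating(rng=random, high_share=0.7):
    """Return a 1-decimal rating; about `high_share` of draws are >= 4.0."""
    if rng.random() < high_share:
        # High band: 4.0 to 5.0 inclusive after rounding.
        return round(rng.uniform(4.0, 5.0), 1)
    # Low band: draw in tenths up to 3.9 so rounding can never reach 4.0.
    return rng.randint(0, 39) / 10
```

To use it, replace the `"rating"` expression in the generator with `skewed_rating()`. The split into exactly two uniform bands is a simplification; a real review distribution would likely be smoother.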
Data Quality Checks
- Uniqueness: Ensure no duplicate IDs
- Price Consistency: Verify 2 decimal places
- Stock Logic: Match "Out of Stock" with stock = 0
- Rating Distribution: 70% of ratings should be ≥4.0 (simulate positive reviews)
- Category-Name Alignment: Ensure product names match categories (e.g., "Yoga Mat" in "Sports")
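The checks above could be automated along these lines. This is a sketch under assumptions: `check_products` and `EXPECTED_CATEGORY` are hypothetical names, the keyword map is taken from the five sample products, and the dict keys match those used in the generator snippet.

```python
from collections import Counter

# Hypothetical keyword-to-category map, based on the sample table.
EXPECTED_CATEGORY = {
    "Headphones": "Electronics",
    "T-Shirt": "Clothing",
    "Yoga Mat": "Sports",
    "Hub": "Electronics",
    "Bottle": "Home",
}


def check_products(products, min_high_share=0.7):
    """Return a list of problem messages; an empty list means all checks passed."""
    problems = []

    # Uniqueness: every ID should appear exactly once.
    for product_id, count in Counter(p["id"] for p in products).items():
        if count > 1:
            problems.append(f"duplicate id {product_id}")

    for p in products:
        # Price consistency: no more than 2 decimal places.
        if round(p["price"], 2) != p["price"]:
            problems.append(f"id {p['id']}: price {p['price']} has more than 2 decimals")

        # Stock logic: "Out of Stock" if and only if stock is 0.
        if (p["stock"] == 0) != (p["availability"] == "Out of Stock"):
            problems.append(f"id {p['id']}: stock {p['stock']} contradicts {p['availability']!r}")

        # Category-name alignment: a known keyword implies a category.
        for keyword, category in EXPECTED_CATEGORY.items():
            if keyword in p["name"] and p["category"] != category:
                problems.append(f"id {p['id']}: {keyword!r} expected in {category}, got {p['category']}")

    # Rating distribution: share of ratings at or above 4.0.
    if products:
        share = sum(p["rating"] >= 4.0 for p in products) / len(products)
        if share < min_high_share:
            problems.append(f"only {share:.0%} of ratings are >= 4.0")

    return problems
```

Calling `check_products(products)` on a generated batch returns an empty list when every check passes. On small batches the rating-share check is noisy, so it is more meaningful on a few hundred products or more.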
This approach creates realistic, diverse product data suitable for testing e-commerce platforms, dashboards, or analytics pipelines. Adjust parameters (e.g., price ranges, category weights) based on your specific use case.