A schema generated from one JSON sample is wrong from day one: it marks every field as required and locks each field to the exact type that one sample happened to contain. Real API data varies — fields are missing on some records, null on others. The fix is to infer from multiple samples and merge them: a field absent from any sample becomes optional, and a field whose type differs across samples becomes nullable or a union. This guide explains the optional / required / nullable distinction and walks through generating a schema that survives real data.
Why one sample isn't enough
Auto-inference from a single JSON object is fast, but it makes two assumptions that are almost always wrong for real APIs:
- Every field is required. If your one sample includes phone, the schema demands phone on every record, even though many records omit it.
- Every type is fixed. If deleted_at is a string in your sample, the schema rejects the null the API actually returns most of the time.
Both assumptions come from the same root cause: a single sample contains no variation, so the generator has nothing from which to infer what is optional or nullable. Collect a few real responses and merge them, and the variation tells the generator everything it was previously guessing.
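To make the two assumptions concrete, here is a minimal sketch of what naive single-sample inference amounts to. The function names and the sample record are hypothetical, chosen only for illustration; this is not the code of any particular generator.

```python
import json


def json_type(value):
    """Map a value parsed by json.loads to a JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def infer_from_one(sample):
    """Naive inference: every key seen is required, every type is locked."""
    return {
        "type": "object",
        "properties": {key: {"type": json_type(value)} for key, value in sample.items()},
        "required": list(sample),
    }


# Hypothetical sample record, for illustration only
sample = json.loads(
    '{"id": 1, "phone": "+1-202-555-0143", "deleted_at": "2024-01-01T00:00:00Z"}'
)
schema = infer_from_one(sample)
print(schema["required"])                  # ['id', 'phone', 'deleted_at']
print(schema["properties"]["deleted_at"])  # {'type': 'string'}
```

With only one record to look at, the function has no way to know that phone is sometimes omitted or that deleted_at is usually null, so it encodes the sample as if it were the rule.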
Optional vs. required vs. nullable
These three words get used interchangeably, but they are independent properties — and confusing them is one of the most common schema bugs. Optional is about whether the key can be absent. Nullable is about whether the value can be null when the key is present.
| Field state | Key can be absent? | Value can be null? | In required? |
|---|---|---|---|
| Required, non-null | No | No | Yes |
| Required, nullable | No | Yes | Yes (type includes null) |
| Optional, non-null | Yes | No | No |
| Optional, nullable | Yes | Yes | No (type includes null) |
An optional-and-nullable field has three distinct states the schema must allow: present with a value, present but null, or absent entirely. A single-sample generator can only ever capture one of those three. Merging real samples is how you cover all of them.
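The table can be read as two independent switches. The sketch below, using hypothetical helper names, classifies a field into one of the three states and then lists which states each row of the table would accept.

```python
def field_state(record, key):
    """Classify a key in one parsed record as 'absent', 'null', or 'value'."""
    if key not in record:
        return "absent"
    return "null" if record[key] is None else "value"


def allows(state, *, in_required, nullable):
    """Would a schema with these two settings accept a field in this state?"""
    if state == "absent":
        return not in_required
    if state == "null":
        return nullable
    return True  # a real value; checking its type is a separate concern


# One record per state: present with a value, present but null, absent
records = [{"phone": "+1-202-555-0143"}, {"phone": None}, {}]
states = [field_state(record, "phone") for record in records]
print(states)  # ['value', 'null', 'absent']

for in_required in (True, False):
    for nullable in (False, True):
        accepted = [
            s for s in states
            if allows(s, in_required=in_required, nullable=nullable)
        ]
        print(f"in_required={in_required}, nullable={nullable}: {accepted}")
```

Only the optional-and-nullable combination accepts all three states, which is why a schema for such a field needs both settings.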
How multi-sample merging works
The Multi-Sample Schema Generator infers a schema for each sample you paste, then folds them together with three rules:
- Required = intersection. A field stays in required only if it is present and non-null in every sample. If it is missing from any sample, or null in any sample, it drops out of required and is treated as optional.
- Types union per field. If a field is a string in one sample and null in another, the merged type is ["string","null"]. Object-vs-null becomes an anyOf of the object schema and a null type.
- It recurses. Nested objects and array items are merged with the same rules at every depth.
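The three rules can be sketched in a few dozen lines. This is an illustrative approximation under stated assumptions, not the tool's actual implementation: it merges raw values directly instead of merging per-sample schemas, it assumes every top-level sample is a JSON object, and the function names and sample data are hypothetical.

```python
import json


def json_type(value):
    """Map a scalar parsed by json.loads to a JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    return "string"


def merge_values(values):
    """Build one schema from every value observed at the same position."""
    objects = [v for v in values if isinstance(v, dict)]
    arrays = [v for v in values if isinstance(v, list)]
    scalar_types = []  # first-seen order, no duplicates
    for v in values:
        if not isinstance(v, (dict, list)) and json_type(v) not in scalar_types:
            scalar_types.append(json_type(v))

    branches = []
    if objects:
        branches.append(merge_objects(objects))  # rule 3: recurse into objects
    if arrays:
        items = [item for arr in arrays for item in arr]
        array_schema = {"type": "array"}
        if items:
            array_schema["items"] = merge_values(items)  # rule 3: recurse into items
        branches.append(array_schema)

    if not branches:  # rule 2: scalars only, so one type or a type array
        return {"type": scalar_types[0] if len(scalar_types) == 1 else scalar_types}
    if len(branches) == 1 and not scalar_types:
        return branches[0]
    # Mixed shapes such as object-vs-null become an anyOf
    return {"anyOf": branches + [{"type": t} for t in scalar_types]}


def merge_objects(samples):
    """Merge a list of parsed JSON objects into one object schema."""
    keys = list(dict.fromkeys(k for s in samples for k in s))  # first-seen order
    properties = {k: merge_values([s[k] for s in samples if k in s]) for k in keys}
    # Rule 1: required only if present and non-null in every sample
    required = [k for k in keys if all(s.get(k) is not None for s in samples)]
    return {"type": "object", "properties": properties, "required": required}


# Hypothetical samples, for illustration only
samples = [json.loads(text) for text in (
    '{"id": 1, "nickname": "al", "address": {"city": "Oslo"}}',
    '{"id": 2, "nickname": null, "address": null}',
    '{"id": 3}',
)]
schema = merge_objects(samples)
print(json.dumps(schema, indent=2))
```

In this run only id stays in required, nickname gets the type array ["string", "null"], and address becomes an anyOf of the object schema and a null type.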
How to do it, step by step
- Collect 2–5 real samples that differ on purpose: a happy-path record, one with an optional field omitted, and one with a nullable field set to null.
- Open the Multi-Sample Schema Generator. It runs entirely in your browser; nothing is uploaded.
- Paste one sample per slot. Click Add sample for up to five slots. Invalid JSON is flagged on the specific slot.
- Click Merge into Schema. Optional and nullable fields are detected automatically.
- Review, tighten, and validate. Add the constraints inference can't know (enum, minimum, maximum, pattern), then confirm it in the JSON Schema Validator.
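If you script the same workflow instead of using the browser tool, the per-slot check for invalid JSON might look like the sketch below. The function name and slot contents are hypothetical; it relies only on Python's standard json module.

```python
import json


def parse_slots(slots):
    """Parse each pasted sample; report problems keyed by 1-based slot number."""
    parsed, problems = [], {}
    for index, text in enumerate(slots, start=1):
        try:
            parsed.append(json.loads(text))
        except json.JSONDecodeError as exc:
            problems[index] = f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return parsed, problems


# Hypothetical slot contents; the second has a trailing comma, which is invalid JSON
slots = [
    '{"id": 1, "phone": "+1-202-555-0143"}',
    '{"id": 2,}',
    '{"id": 3, "phone": null}',
]
parsed, problems = parse_slots(slots)
for slot, message in problems.items():
    print(f"Slot {slot} is not valid JSON ({message})")
```

Reporting the slot number alongside the parser's position keeps a single bad paste from blocking the other samples.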
A worked example
Three user records — note that phone appears in one, is missing in another, and is null in a third, while role is missing from the first:
// Sample 1
{ "id": 1, "name": "Alice", "email": "alice@example.com", "phone": "+1-202-555-0143", "active": true }
// Sample 2
{ "id": 2, "name": "Bob", "email": "bob@example.com", "active": false, "role": "admin" }
// Sample 3
{ "id": 3, "name": "Carol", "email": "carol@example.com", "phone": null, "active": true, "role": "editor" }
Merging all three produces a schema that reflects the real variation — phone is optional and nullable, role is optional, and only the fields present in every sample stay required:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "name": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "phone": { "type": ["string", "null"] },
    "active": { "type": "boolean" },
    "role": { "type": "string" }
  },
  "required": ["id", "name", "email", "active"]
}
Compare that to what a single-sample generator would emit from Sample 1 alone: phone would be a plain required string, and role would not exist in the schema at all. The first time production sends Bob (no phone) or Carol (phone: null), that schema rejects valid data.
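The difference is easy to check by hand. The sketch below uses a deliberately tiny checker that understands only required and type (it is a stand-in for illustration, not a real JSON Schema validator) and runs Bob and Carol against both schemas. The format hint on email is left out because the checker ignores it anyway.

```python
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
}


def errors(record, schema):
    """Return a list of problems; checks only `required` and `type`."""
    problems = [
        f"missing required field: {key}"
        for key in schema.get("required", [])
        if key not in record
    ]
    for key, value in record.items():
        prop = schema["properties"].get(key)
        if prop is None:
            continue  # unknown keys are allowed by default in JSON Schema
        allowed = prop["type"] if isinstance(prop["type"], list) else [prop["type"]]
        if not any(TYPE_CHECKS[t](value) for t in allowed):
            problems.append(f"{key}: expected {allowed}, got {value!r}")
    return problems


# What a single-sample generator would produce from Sample 1 alone
single = {
    "properties": {
        "id": {"type": "integer"}, "name": {"type": "string"},
        "email": {"type": "string"}, "phone": {"type": "string"},
        "active": {"type": "boolean"},
    },
    "required": ["id", "name", "email", "phone", "active"],
}
# The merged schema from the worked example
merged = {
    "properties": {
        "id": {"type": "integer"}, "name": {"type": "string"},
        "email": {"type": "string"}, "phone": {"type": ["string", "null"]},
        "active": {"type": "boolean"}, "role": {"type": "string"},
    },
    "required": ["id", "name", "email", "active"],
}

bob = {"id": 2, "name": "Bob", "email": "bob@example.com",
       "active": False, "role": "admin"}
carol = {"id": 3, "name": "Carol", "email": "carol@example.com",
         "phone": None, "active": True, "role": "editor"}

print(errors(bob, single))    # ['missing required field: phone']
print(errors(carol, single))  # ["phone: expected ['string'], got None"]
print(errors(bob, merged))    # []
print(errors(carol, merged))  # []
```

For real validation, use a full JSON Schema validator; this checker exists only to show where the single-sample schema fails.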
The merged schema is still a starting point
Merging fixes structure — which fields exist, which are optional, which are nullable. It cannot infer your rules. A sample can't tell the generator that status must be one of three values, that age is between 0 and 120, or that email must match a pattern. After merging, tighten the schema:
"status": { "type": "string", "enum": ["active", "inactive", "pending"] },
"age": { "type": "integer", "minimum": 0, "maximum": 120 },
"role": { "type": "string", "enum": ["admin", "editor", "viewer"] }
Single-sample vs. multi-sample generators
| Capability | Single-sample generator | Multi-sample merge |
|---|---|---|
| Detects optional fields | No — every field required | Yes — required intersection |
| Detects nullable types | No — type locked to the sample | Yes — type union per field |
| Handles diverging shapes | No | Yes — anyOf across samples |
| Format hints (email, uuid, date-time) | Yes | Yes (preserved when samples agree) |
| Runs in the browser | Varies | Yes — nothing uploaded |
Frequently Asked Questions
Why is a JSON Schema generated from one sample wrong?
A single sample cannot show variation. The generator marks every field present in that one sample as required, and locks each field to the exact type it saw. Real data is not like that: some fields are missing on some records, and some come back null. A schema built from one sample rejects valid responses the moment production sends a record that differs from your lucky example.
How does merging multiple samples detect optional fields?
A field is kept in the required list only if it is present and non-null in every sample. If any sample omits the field, it drops out of required and becomes optional while remaining in properties. This is the intersection of the required fields across all samples.
How are nullable fields handled?
When a field is a string in one sample and null in another, the merger unions the types into a single type array such as ["string","null"]. When a field is an object in one sample and null in another, it becomes an anyOf of the object schema and a null type. Either way the resulting schema accepts both shapes.
What is the difference between optional, required, and nullable?
Optional means the key can be absent entirely. Nullable means that when the key is present, its value can be null. They are independent: a field can be required and non-null, required and nullable, optional and non-null, or optional and nullable. An optional-and-nullable field has three states (present with a value, present but null, or absent), and your schema must allow all three.
How many samples should I merge?
Two to five is usually enough. Choose samples that differ on purpose: include the happy path plus the edge cases you know exist — a record with optional fields omitted, one with nullable fields set to null, and any variant with a different shape. The merger can only learn variation it actually sees.
Is the generated schema ready for production?
It is a strong starting point, not a finished schema. Inference cannot know your business rules: allowed enum values, numeric minimums and maximums, string length limits, or regex patterns. Review the merged schema, add those constraints, and validate it against real data before relying on it.
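One way to keep that review step repeatable is to store the hand-written constraints separately and layer them onto whatever the merge produces. The sketch below assumes a hypothetical tighten helper and a hypothetical username field; status and age follow the examples used earlier in this guide.

```python
import copy


def tighten(schema, constraints):
    """Return a copy of `schema` with extra keywords added to existing properties."""
    tightened = copy.deepcopy(schema)
    for field, keywords in constraints.items():
        # A KeyError here means the constraint names a field the merge never saw
        tightened["properties"][field].update(keywords)
    return tightened


# Hypothetical merged schema, for illustration only
merged = {
    "type": "object",
    "properties": {
        "status": {"type": "string"},
        "age": {"type": "integer"},
        "username": {"type": "string"},
    },
    "required": ["status"],
}

final = tighten(merged, {
    "status": {"enum": ["active", "inactive", "pending"]},
    "age": {"minimum": 0, "maximum": 120},
    "username": {"minLength": 3, "maxLength": 32, "pattern": "^[a-z0-9_]+$"},
})
print(final["properties"]["age"])  # {'type': 'integer', 'minimum': 0, 'maximum': 120}
```

Because the constraints live in their own mapping, you can regenerate the merged schema from fresh samples and reapply the same business rules without editing the output by hand.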
Build a schema from your real data
Paste 2–5 samples and merge them into one draft-07 schema — optional and nullable fields detected automatically, all in your browser.