Schema generation #246

Open
4 of 7 tasks
joolfe opened this issue Nov 26, 2022 · 2 comments

Comments

@joolfe
Owner

joolfe commented Nov 26, 2022

Schemas generation

Summary

The goal of this issue is to design a feature that automatically generates schema objects in the OpenAPI output. It also groups the existing PRs and issues on this topic (PR237, 183 and 22) and explains why the feature is more complicated than expected and why it needs time to be designed and developed. Time is something I don't have much of for an open source project, sorry.

Help is always appreciated, but this is not a request for help. I want you to understand why this feature needs time: time to read the specs, time to design it correctly, and time to develop and test it.

Requirements/Decisions

These are the desired requirements and things to be decided when implementing this feature:

  • The schema should be generated for the Request and also for the Response.
  • The Response schema is built from response, which is an array, so we need to decide whether to generate multiple schemas or a single merged one (if supported). See Multiple schemas.
  • We need to take into consideration which JSON Schema is used by OpenAPI 3.0. See Considerations about the OpenAPI JSON schema.
  • If possible, we should use a library to translate JSON into a schema so we don't need to maintain that code ourselves. See Using a library.
  • XML will not be supported. OpenAPI seems to support XML schemas, but we will focus only on JSON.

Multiple schemas

For response generation we will use the Postman object item[*].response, which is an array, so a single operation can produce multiple response schemas. The strategy should be:

  • Take into consideration only responses whose status code is 200 or 201, or any 2XX directly. We don't want to use the error response schemas.
  • When there are multiple 2XX responses we can:
    • Try to merge the schemas. This is probably too complicated and we cannot ensure good results, but it could be a toggle feature.
    • Use the first defined response. If Postman respects the order, users can choose the schema they want by making sure the first response is the most detailed one (we need to test whether Postman respects the order).
    • Use a mark in one of the responses. Responses have a name, so we can add something like a [required] mark to identify the chosen one.
    • If the schema supports multiple definitions for a response, use that, so the response has multiple schemas.
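
A minimal sketch of the selection options above (2XX filter, [required] mark, first-response fallback). It assumes each saved response has the name, code and body fields of the Postman Collection v2.1 format; pickSuccessResponse and the MARK constant are hypothetical names for illustration, not part of the tool.

```javascript
// Marker text proposed in this issue; not an existing feature of the tool.
const MARK = '[required]'

// Hypothetical helper: choose which saved Postman response feeds the schema.
function pickSuccessResponse (item) {
  const responses = Array.isArray(item.response) ? item.response : []
  // Keep only 2XX responses; error responses are ignored.
  const ok = responses.filter(r => Number.isInteger(r.code) && r.code >= 200 && r.code < 300)
  if (ok.length === 0) return undefined
  // Prefer a response explicitly marked in its name, else fall back to the first one.
  const marked = ok.find(r => typeof r.name === 'string' && r.name.includes(MARK))
  return marked || ok[0]
}

const item = {
  name: 'Get user',
  response: [
    { name: 'Not found', code: 404, body: '{"error":"not found"}' },
    { name: 'Short user', code: 200, body: '{"id":1}' },
    { name: 'Full user [required]', code: 200, body: '{"id":1,"name":"Ann"}' }
  ]
}

console.log(pickSuccessResponse(item).name) // → Full user [required]
```

The first-response fallback is only reliable if Postman keeps the order of the array, which is still to be checked.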

Considerations about the OpenAPI JSON schema

We are going to focus on OpenAPI 3.0, as it is our current target. If the tool later evolves to support OpenAPI 3.1 this could change, so we should implement it in a way that can be updated when that happens. The OpenAPI docs indicate that the schema used is "an extended subset of the JSON Schema Specification Wright Draft 00", so the OpenAPI schema has its own rules on top of JSON Schema Draft 5.

After some reading here and here, and after reviewing the OpenAPI data model specs, it seems that the main difference between JSON Schema and the OpenAPI schema is that some keywords are unsupported or changed, while the main shape is the same. The supported and unsupported keywords are listed here.

In summary:

  • Data types: only null is not supported by OpenAPI; in OpenAPI this is indicated with the nullable keyword.
  • Multiple types: in OpenAPI an item cannot have multiple types (type: ['string', 'number'] is not allowed). The oneOf keyword gives a similar result but is not an exact equivalent, so we should be careful here.
  • Enum: supported by both specs, but it is very difficult to infer an enum from an example JSON, so we are probably not going to support this.
  • Array types: arrays are defined in the same way for basic shapes, which is exactly what we are going to use. Things like additionalItems or other complex schema keywords will not be inferred from a JSON object so...
  • Object types: the most basic shape of an object is defined in the same way. Complex structures, or keywords like additionalProperties that do not have exactly the same meaning in OpenAPI, will not be inferred from a JSON object so...
  • Unsupported keywords: unsupported keywords are not easy to infer from a JSON object (response or request example), so they will not be a problem because they will not be used.
  • Additional keywords: only nullable and example seem to be inferable from a JSON object example.

In summary, we can look for a library that creates a JSON schema from a JSON object example, but it needs to avoid the unsupported keywords and be careful with nullable, and we may need to extend it to add example and multiple types. Whether we write the code ourselves depends on the result of the library analysis.
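
To make the target shape concrete, here is a rough sketch of what inferring an OpenAPI 3.0 schema from a single JSON example could look like, restricted to the basic keywords discussed above (type, properties, items, nullable, example). The inferSchema function is a hypothetical illustration, not an existing library API, and it makes two simplifying assumptions: arrays are described by their first element only, and a null value yields a typeless placeholder because the real type cannot be known from one example.

```javascript
// Hypothetical sketch: infer an OpenAPI 3.0 Schema Object from one parsed JSON value.
function inferSchema (value) {
  if (value === null) {
    // OpenAPI 3.0 has no `null` type. The real type is unknown here, so this is
    // only a placeholder to resolve once another example reveals the type.
    return { nullable: true }
  }
  if (Array.isArray(value)) {
    // Assumption: the first element is representative of every item.
    return { type: 'array', items: value.length > 0 ? inferSchema(value[0]) : {} }
  }
  switch (typeof value) {
    case 'string':
      return { type: 'string', example: value }
    case 'boolean':
      return { type: 'boolean', example: value }
    case 'number':
      return { type: Number.isInteger(value) ? 'integer' : 'number', example: value }
    case 'object': {
      // Objects are treated as fixed structures: one property per key in the example.
      const properties = {}
      for (const [key, child] of Object.entries(value)) {
        properties[key] = inferSchema(child)
      }
      return { type: 'object', properties }
    }
    default:
      throw new TypeError(`Not a JSON value: ${typeof value}`)
  }
}

const schema = inferSchema(JSON.parse('{"id": 1, "name": "Ann", "tags": ["a"], "deletedAt": null}'))
console.log(JSON.stringify(schema, null, 2))
```

No enum, additionalItems or additionalProperties is ever emitted, which matches the point that those keywords cannot be inferred from an example anyway.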

Using a library

TBD...

Points to review:

  • detect nulls
  • support multi types (oneOf)
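
As an illustration of these two points, the sketch below merges two inferred schemas for the same field: a null-only schema turns into nullable on the other type, and two different types become a oneOf. The mergeSchemas and variants helpers are hypothetical, the merge is shallow (the first schema seen for each type wins, nested properties are not merged), and placing nullable on a single oneOf variant is a design assumption so that null matches exactly one branch.

```javascript
// Hypothetical helper: treat a oneOf schema as a list of variants, anything else as one variant.
function variants (schema) {
  return Array.isArray(schema.oneOf) ? schema.oneOf : [schema]
}

// Hypothetical shallow merge of two OpenAPI 3.0 schemas inferred for the same field.
function mergeSchemas (a, b) {
  const all = [...variants(a), ...variants(b)]
  // Detect nulls: any variant flagged nullable makes the merged result nullable.
  const nullable = all.some(s => s.nullable === true)
  const byType = new Map()
  for (const { nullable: _ignored, ...rest } of all) {
    // Typeless schemas are null-only placeholders; they add no variant of their own.
    if (rest.type !== undefined && !byType.has(rest.type)) byType.set(rest.type, rest)
  }
  const kinds = [...byType.values()]
  if (kinds.length === 0) return nullable ? { nullable: true } : {}
  // nullable sits next to an explicit type, and on one variant only.
  if (nullable) kinds[0] = { ...kinds[0], nullable: true }
  return kinds.length === 1 ? kinds[0] : { oneOf: kinds }
}

const mixed = mergeSchemas({ type: 'string' }, { type: 'integer' })
console.log(JSON.stringify(mixed))
// → {"oneOf":[{"type":"string"},{"type":"integer"}]}

console.log(JSON.stringify(mergeSchemas({ type: 'string' }, { nullable: true })))
// → {"type":"string","nullable":true}
```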

TODO

  • Review which JSON Schema version/dialect is used by OpenAPI 3.0.
  • Review whether the JSON Schema used by OpenAPI 3.0 differs from Draft 5 in anything important (something that is going to be used by our library).
  • Is there any way to distinguish between a free-form object, a dictionary and an object? I think we should just define objects with a fixed structure...
  • Check whether Postman respects the order of the responses defined in an operation, or whether we cannot rely on it because it is an array.
  • Review whether the JSON Schema version/dialect has changed in OpenAPI 3.1.
  • Review whether the OpenAPI schema object supports multiple definitions for a response, something like "This response can be a string or a number".
  • Get a list of npm libraries that we can take into consideration.
@ryanackley

From my perspective, I want to use this tool for commercial APIs used in production by well-known companies that don't produce an OpenAPI spec. For example, Zendesk, Quickbooks, OpenAI. I don't see it as a generic tool to retrofit random legacy APIs.

In that context, the schemas are usually very well formed so they can be easily explained and documented to customers. You don't see free-form objects and dictionaries (basically user-defined data shapes). Multiple response definitions for the same endpoint are another extreme edge case. Yes, it can support it (see anyOf, oneOf, etc.), but it's very uncommon because it's hard to explain to integrators.

@suzy-g38

Hey, is this issue still open? I faced the problem of generating schemas for both request and response while working on a project, and wrote some additional code to fix it. If you still want to add the automated schema generation feature, I would love to do that. @joolfe
