API guidance: Implementation guidance for APIs - Part 3

Application Programming Interfaces (APIs) are an integral part of modern distributed systems – and an essential link in the data supply chain for modern, data-driven organisations. They are thus indispensable for the data-sharing ecosystem that the Support Centre for Data Sharing seeks to promote. In this three-part series – accompanied by eLearning modules – we want to provide concentrated, practical knowledge both to those who are thinking about developing and deploying an API, and thus need to understand the basics of this technology, and to those who are looking for more information on what to consider in the implementation process.

This content will be published in two ways:

  1. Here, as episodic website-content, with a new instalment every month, and
  2. after the series’ completion, as a single document that you can download and read offline.

The guide is available in English, French and German. Each instalment is followed by e-learning modules, covering their respective topics – and helping you to test your knowledge.

The second instalment gave an overview of recent API types. The third instalment will discuss the implementation guidance for APIs and will focus on:

  • RESTful interfaces
  • GraphQL interfaces
  • Security design
  • API documentation

Stay tuned and don’t forget to discuss the materials by commenting on each instalment, or with your peers in our forum.

7. RESTful interfaces

RESTful interfaces are useful for transferring bulk data and for situations in which your service interface and the respective use cases are fixed.

If you start to design an API, you should not derive the API from your application’s code, e.g. Java classes. Instead, take a look at the bounded context of the domain and design the API around the core entities of the domain and the required functionality. Specifically, a REST interface focuses on resources. Resources can be objects, tables, documents and many more, e.g. the entities of your core domain; in short: any data you would like to publish. So, if you design a REST API, you should adopt a hypermedia way of thinking, because linking different resources is the underlying idea of RESTful interfaces. Also keep in mind that REST interfaces are primarily used for communication between services and self-contained systems, but are less common for internal usage.

7.1 Basics and structure

If you design a REST API, you have to take care of the four basic concepts: HTTP(S) protocol for transportation, URLs for addressing resources, the data representation format, and the predefined methods for HTTP.

RESTful services should use HTTP/2 over TLS (HTTPS) [1], as this offers a more secure transport protocol.

In any case, the URLs should also abide by the following basic pattern to address resource sets, specific resources and calling methods on resources:

/{version}/{collection}/{resource id}/{method name}?{query parameters}

Collections should be named in plural and resources must have a unique identifier. Identifiers should use the camel case notation, e.g. “/v10/myFavoredCollection”.
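The URL pattern above can be sketched as a small helper. This is only an illustration: the function name `build_resource_url`, the collection `starships` and the method `refuel` are assumptions made for the example and are not part of the guidance.

```python
from urllib.parse import quote, urlencode


def build_resource_url(version, collection, resource_id=None, method=None, query=None):
    """Assemble /{version}/{collection}/{resource id}/{method name}?{query parameters}."""
    if method is not None and resource_id is None:
        raise ValueError("a method can only be called on a specific resource")
    segments = [version, collection]
    if resource_id is not None:
        segments.append(str(resource_id))
    if method is not None:
        segments.append(method)
    # Percent-encode every segment so that identifiers cannot break the path.
    path = "/" + "/".join(quote(segment, safe="") for segment in segments)
    if query:
        path += "?" + urlencode(query)
    return path


# Hypothetical usage: a collection, one resource, and a method call on it.
collection_url = build_resource_url("v1", "starships", query={"skip": 20, "limit": 10})
resource_url = build_resource_url("v1", "starships", 3003)
method_url = build_resource_url("v1", "starships", 3003, "refuel")
```

Building URLs in one place makes it easier to keep the version prefix and the plural collection names consistent across the whole API.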

The data representation format used in the body of an HTTP request or response should be JSON or YAML. The structure of the data should be specified using a well-known schema definition language, e.g. OpenAPIv3 (see section 10 on API documentation). Additionally, the HTTP headers “Content-Type” and “Content-Language” must be correctly set. The user of an API should be able to select the preferred format using the HTTP header “Accept”. Each resource should contain an attribute “id”, which is the unique identifier of the specific resource. Resources should have a structured format that includes the complete information of the resource.

The GET method is used to request/search specific resources or a list of resources from a collection. Search parameters should be defined to support limiting the number of results and support paging by evaluating the query parameters “skip” and “limit”. The POST method is used to create new resources, while the PUT method is used to update/modify resources. The DELETE method is used to remove specific resources.

In general, all requests and responses should support HTTP headers to transfer required or useful meta information, e.g. for searching “Results-Matching”, “Results-Skipped” and “Link”.
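A minimal sketch of paging with “skip” and “limit” could look as follows. It assumes that “Results-Matching” carries the total number of matching resources and “Results-Skipped” the number of skipped ones; the guidance names these headers but does not define their values, so this reading, the function name `page_collection` and the `max_limit` cap are assumptions.

```python
def page_collection(items, base_path, skip=0, limit=20, max_limit=100):
    """Return one page of items plus the meta headers for the response."""
    if skip < 0 or limit < 1:
        raise ValueError("skip must be >= 0 and limit must be >= 1")
    limit = min(limit, max_limit)  # protect the service from oversized pages
    page = items[skip:skip + limit]
    headers = {
        "Results-Matching": str(len(items)),  # assumed meaning: total matches
        "Results-Skipped": str(skip),         # assumed meaning: items skipped
    }
    next_skip = skip + limit
    if next_skip < len(items):
        # Point the client to the next page, in the style of a web link header.
        headers["Link"] = f'<{base_path}?skip={next_skip}&limit={limit}>; rel="next"'
    return page, headers


# Hypothetical usage with 45 matching resources.
page, headers = page_collection(list(range(45)), "/v1/starships", skip=20, limit=10)
```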

7.2 Recommendations

Because APIs can evolve over time, it is important to maintain a versioning of the API releases. To help users of an API estimate the impact of a new release, the API release number must be unambiguous, e.g. by adopting the rules of the Semantic Versioning Specification [2] (SemVer). An API should support the current release and the prior release for an interim period. The release number should be part of each URL, e.g. https://api.example.org/v1/...; alternatively, it could be implemented according to the hypermedia paradigm with HTTP headers, e.g. Accept: application/vnd.example;version=1.0.
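To illustrate how SemVer release numbers help users estimate the impact of a release, the following sketch compares two versions. It only handles the plain MAJOR.MINOR.PATCH form (no pre-release or build metadata), and the function names are assumptions chosen for this example.

```python
import re

# Core SemVer form only: three dot-separated numbers without leading zeros.
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_version(text):
    match = SEMVER_CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def release_impact(current, new):
    """Classify a new release as 'major', 'minor' or 'patch'."""
    old_v, new_v = parse_version(current), parse_version(new)
    if new_v <= old_v:
        raise ValueError("the new release must be greater than the current one")
    if new_v[0] != old_v[0]:
        return "major"  # incompatible API changes
    if new_v[1] != old_v[1]:
        return "minor"  # backward-compatible additions
    return "patch"      # backward-compatible fixes


def url_version_segment(version):
    """Only the major number appears in the URL, e.g. 'v1' for 1.4.2."""
    return f"v{parse_version(version)[0]}"
```

With this scheme, only a major release changes the URL prefix, so existing clients keep working across minor and patch releases.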

The results of a GET request should also be cacheable. To achieve this, the response must include the correct HTTP headers ETag and Last-Modified, and the service must evaluate the corresponding conditional request headers If-None-Match and If-Modified-Since. If there are too many requests in a given timeframe, further requests should be rejected with an appropriate error message.
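As a sketch of cacheable GET responses: in HTTP, the server sends ETag with the response, and the client sends the value back in If-None-Match on the next request. The example below derives the ETag from a hash of the JSON representation, which is one possible choice and an assumption of this sketch. It also simplifies by comparing a single ETag value only, although If-None-Match may carry a list.

```python
import hashlib
import json


def make_etag(resource):
    """Derive a strong ETag from the canonical JSON form of the resource."""
    body = json.dumps(resource, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(resource, request_headers):
    """Return (status, headers, body) for a GET on the given resource."""
    etag = make_etag(resource)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: no body is sent.
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, resource


# Hypothetical usage: first request, then a revalidation with the ETag.
ship = {"id": "3003", "name": "Imperial shuttle"}
status, headers, body = conditional_get(ship, {})
revalidated = conditional_get(ship, {"If-None-Match": headers["ETag"]})
```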

In general, each HTTP request should be responded to with a correct HTTP status code. Avoid using your own status code and instead use one of the predefined HTTP status codes.

Besides the business functions, an API should provide endpoints for internal operations management. Each API should provide a health, liveness and readiness check status, e.g. under the URLs “/healthz”, “/healthz/liveness” and “/healthz/readiness”. Liveness means the service is basically available, while readiness means that the service accepts new requests right now and all services used by this service are ready, too.
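The difference between liveness and readiness can be sketched as follows. The response bodies, the use of status 503 for "not ready", and the dependency names are assumptions for illustration only.

```python
def liveness():
    """The process is up and able to answer at all."""
    return 200, {"status": "alive"}


def readiness(accepting_requests, dependency_checks):
    """Ready only if this service accepts requests and every dependency is ready.

    dependency_checks maps a dependency name to a callable returning True/False.
    """
    failing = sorted(name for name, check in dependency_checks.items() if not check())
    if accepting_requests and not failing:
        return 200, {"status": "ready"}
    return 503, {"status": "not ready", "failing": failing}


# Hypothetical usage: the search backend is down, so the service is not ready.
result = readiness(True, {"database": lambda: True, "search": lambda: False})
```

A load balancer or orchestrator can then stop routing traffic to an instance that is alive but not ready, without restarting it.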

Furthermore, it is recommended that the service provides metrics for internal monitoring, e.g. under the URL “/metrics”. For this, the service could use the textual exposition format from Prometheus to expose indicator values. Indicators should include items such as usage, errors, and performance. To debug services, a unique tracing ID should be added to the headers, e.g. “Tracing”. The tracing ID should be unique, forwarded automatically to other services that the API communicates with, added to each call (if not present), and be included in the log messages.
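The two recommendations above can be sketched together: rendering counters in the Prometheus text exposition format, and making sure every call carries a tracing ID. The metric name, the help text and the use of a random UUID as tracing ID are assumptions of this sketch.

```python
import uuid


def render_metrics(counters):
    """Render counters as Prometheus text exposition lines.

    counters maps a metric name to a (help text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(counters.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def ensure_tracing_id(headers, header_name="Tracing"):
    """Keep an incoming tracing ID, or create one if the call has none."""
    outgoing = dict(headers)
    if not outgoing.get(header_name):
        outgoing[header_name] = uuid.uuid4().hex
    return outgoing


# Hypothetical usage.
metrics_text = render_metrics({"api_requests_total": ("Total requests handled.", 42)})
forwarded = ensure_tracing_id({"Tracing": "abc123"})
```

The headers returned by `ensure_tracing_id` would be passed on to every downstream call and the ID written to each log message, so that one request can be followed across services.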

7.3 Useful links

eLearning module for chapter 7

8. GraphQL interfaces

GraphQL interfaces are useful if you want to provide data in a more flexible manner. This is particularly the case if not all use cases are known in advance or if they change frequently; an example would be a user interface that is served by an API and for which the required data changes often. GraphQL allows you to provide specific views on the data in your backend, based on a complex data model that requires only one entry point.

Depending on your use case, a GraphQL interface can also be deployed together with a RESTful interface.

8.1 Basics and structure

If you design a GraphQL API you have to take care of the four basic concepts – just as in the case of RESTful interfaces: HTTP(S) protocol for transportation, URLs for the GraphQL entry points, the data model(s), and the pre-defined methods for HTTP.

GraphQL-based services should use HTTP/2 over TLS (HTTPS) as transport protocol for security reasons.

The GraphQL entry point URL should look like this:

/{version}/gql/{user role}

As stated above, a single GraphQL API is in principle able to give uniform access to an entire database – provided a suitable data model and query specifications have been implemented. However, in practice, it is nevertheless recommended to use multiple entry points, ideally one per user role. Ironically, the reason for this recommendation is precisely that GraphQL APIs would be capable of serving many different users with many different data requests. This may appear odd at first, because the supposed advantages of GraphQL APIs are their technical prowess and flexibility. But the same features pose security challenges, if a generic entry point is used by many different users with different roles, requesting data in an uncontrolled manner from the full database.

As a matter of good practice, APIs should be implemented in line with a single, unique security policy. The best way to regulate this is, in the case of a GraphQL API, to offer different, role-based entry points for different users. These entry points should be implemented consistently as “Policy Enforcement Points”, i.e. they should validate the input data syntactically (against a schema) and check the authentication and authorization of the user for each request. The authorization in particular should be based on “Role Based Access Control” (RBAC). If each role is related to a specific data model, it should immediately be obvious what kind of information a user with a specific role can view or modify.

The data model for each entry point should be comprehensive and well-structured for the related user role. The data models should, at the minimum, be specified using the schema definition language (SDL) of GraphQL; ideally, the data models would be specified in OpenAPIv3 (see section 10 on API documentation) and the required SDL file would be generated from the OpenAPIv3 specification file. [3], [4]

To export, exchange and query data, the GraphQL API must provide the “query” operation. The “mutation” operation should not be provided in this context. The API can provide the “subscription” operation, to inform subscribed users if data is updated.
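The role-based entry points and the restriction to the “query” and “subscription” operations can be sketched as a small policy check. The sketch assumes that the user has already been authenticated and that a GraphQL parser has determined the operation type; the role names and the returned status codes are hypothetical.

```python
# Hypothetical roles and the operations each role-specific entry point permits.
# "mutation" is deliberately absent, in line with the recommendation above.
ROLE_POLICY = {
    "partner": {"query"},
    "analyst": {"query", "subscription"},
}


def role_from_path(path):
    """Extract the role from an entry point of the form /{version}/gql/{user role}."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "gql":
        raise ValueError("not a GraphQL entry point")
    return parts[2]


def authorize(path, user_roles, operation_type):
    """Return an HTTP status: 200 if allowed, 404 or 403 otherwise."""
    try:
        role = role_from_path(path)
    except ValueError:
        return 404
    if role not in ROLE_POLICY:
        return 404  # no entry point exists for this role
    if role not in user_roles:
        return 403  # the authenticated user does not hold this role
    if operation_type not in ROLE_POLICY[role]:
        return 403  # operation is not offered on this entry point
    return 200
```

Because each entry point is bound to exactly one role, the policy table also documents at a glance which operations a given role may perform.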

8.2 Recommendations

Especially for external APIs that exchange data with third parties, an integrated development environment (IDE), like GraphiQL, should be available as part of the API. This allows users to explore the data space and test GraphQL expressions. It should also help to speed up the development of new scenarios or the onboarding of new partners.

Because an API can easily evolve over time, it is important to have a versioning of the API releases. To help the users of an API estimate the impact of a new release, the API release number must be unambiguous and must follow the rules of the Semantic Versioning Specification [5] (SemVer). An API should support the current release and the prior release for an interim period. The release number should be part of each URL, e.g. https://api.example.org/v1/gql/...; alternatively, it could be implemented according to the hypermedia paradigm with HTTP headers, e.g. Accept: application/vnd.example;version=1.0. As GraphQL itself is very flexible to changes, especially regarding extensions of the data model, only major releases need to be indicated.

The results of a GraphQL request should be cacheable. Therefore, the identifier of the resources in the data model must be unique, e.g.:

{ "starship": { "id": "3003", "name": "Imperial shuttle" } }
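Why unique identifiers matter for caching can be sketched with a simple client-side cache that stores each resource under a key built from its type and “id”. The key format and the extra field `length` are assumptions for this illustration.

```python
def cache_key(type_name, resource):
    """Build a cache key from the type name and the unique resource id."""
    return f"{type_name}:{resource['id']}"


def store(cache, type_name, resource):
    """Merge a (possibly partial) query result into the cached resource."""
    key = cache_key(type_name, resource)
    cache[key] = {**cache.get(key, {}), **resource}
    return key


# Two queries return different fields of the same starship; because the id
# is unique, both results end up in one cache entry. "length" is hypothetical.
cache = {}
store(cache, "starship", {"id": "3003", "name": "Imperial shuttle"})
store(cache, "starship", {"id": "3003", "length": 20})
```

Without a unique identifier, the two results could not be recognised as the same resource and would have to be cached per query instead.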

The GraphQL API entry point should also support a health, liveness and readiness check, e.g. under the URLs “/healthz”, “/healthz/liveness” and “/healthz/readiness”. Liveness means the service is basically available, while readiness means that the service accepts new requests right now and all services used by this service are ready, too. The GraphQL interface should also provide usage metrics. For debugging purposes, the usage and forwarding of a tracing ID should be implemented.

8.3 Useful links

GraphQL – A query language for your API, https://graphql.org

GraphQL or Bust, https://nordicapis.com/api-ebooks/graphql-or-bust/

GraphiQL – GraphQL IDE Monorepo, https://github.com/graphql/graphiql

GraphQL Playground, https://github.com/prisma/graphql-playground

9. Security design

Security considerations should play an integral part in the API design from the very start. Generally, an implementation should always be safe and secure by design to avoid unintended behaviour or attacks enabled by failures. The implementation should ensure a valid use of the interface, e.g. by implementing strong input validation. Furthermore, the implementation should avoid common failures such as buffer overflows. To analyse failures and security incidents, all relevant events and errors should always be logged in detail in an audit log.

9.1 Communication

We strongly recommend securing all traffic for data privacy and security reasons. For this, transport layer security should be used for both external and internal communication. Therefore, HTTP/2 over TLS (HTTPS) should be used. Attempts to access a service with plain HTTP or old encryption protocols, like TLS 1.2 and earlier, should be rejected. Security measures like HTTP Strict Transport Security (HSTS) and Perfect Forward Secrecy (PFS) should be supported.
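A minimal sketch of these transport settings with Python's standard `ssl` module is shown below. It sets the minimum protocol version to TLS 1.3 so that older versions are refused, and defines an HSTS response header. The function name and the `max-age` value of one year are assumptions; loading the server certificate is left out.

```python
import ssl


def make_server_tls_context():
    """Create a server-side TLS context that refuses TLS 1.2 and earlier."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    # A real server would now load its certificate and key with
    # context.load_cert_chain(...); omitted here because the paths are site-specific.
    return context


# HSTS tells browsers to use HTTPS only; the max-age (one year) is an assumption.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")

tls_context = make_server_tls_context()
```

TLS 1.3 only offers key exchanges with forward secrecy, so this setting also covers the PFS recommendation.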

If there are too many calls in a certain period, the service should decline the request temporarily and signal this by returning a status code, e.g. “429 Too Many Requests”. To help clients avoid this, the service can signal its rate limits, e.g. by using the headers X-Rate-Limit-Limit, X-Rate-Limit-Remaining and X-Rate-Limit-Reset in a response.
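Rate limiting with the headers named above can be sketched with a simple fixed-window counter. The class name, the fixed-window strategy, and the reading of X-Rate-Limit-Reset as "seconds until the window resets" are assumptions; other strategies such as token buckets are equally possible.

```python
class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client within each time window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows = {}  # client id -> (window start, request count)

    def check(self, client_id, now):
        """Return (status, headers) for a request arriving at time `now` (seconds)."""
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window has expired
        allowed = count < self.limit
        if allowed:
            count += 1
        self._windows[client_id] = (start, count)
        headers = {
            "X-Rate-Limit-Limit": str(self.limit),
            "X-Rate-Limit-Remaining": str(self.limit - count),
            "X-Rate-Limit-Reset": str(int(start + self.window_seconds - now)),
        }
        return (200 if allowed else 429), headers


# Hypothetical usage: two requests per minute; the third one is declined.
limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
first = limiter.check("client-a", now=0)
second = limiter.check("client-a", now=1)
third = limiter.check("client-a", now=2)
```

In a running service, `now` would typically come from a monotonic clock, and the headers would be attached to every response, not only to rejected ones.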

9.2 Authentication and authorization

Except for data published as Open Data, each user should be authenticated to access a service. Known users should also be able to authenticate themselves. The authentication should be implemented using common mechanisms, like OAuth2 or OpenID. If a service communicates with another service, the service should also authenticate itself, e.g. using a regularly renewed API token based on a secret management system like Vault [6].

Authorization to access a function of a service should be based on a “Role-based access control” (RBAC) security concept. For each role, it should be clear which information the users of that role can view and which rights they have.

The authorization should be implemented by a bearer token, like a signed JSON Web Token [7] (JWT), in the HTTP header of a request. The claims of the token should at least include the issuer (iss), the subject (sub), the audience (aud), and the expiration time (exp).
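To show what checking a signed token and its claims involves, the sketch below signs and verifies a JWT with HMAC-SHA256 (HS256) using only the Python standard library. It is meant for illustration under simplifying assumptions (shared secret, one algorithm, hypothetical issuer and audience values); a production service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims, secret):
    """Create a compact HS256-signed JWT from the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url_encode(signature)


def verify_token(token, secret, issuer, audience, now):
    """Return the claims if signature, iss, sub, aud and exp are valid, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except (ValueError, TypeError) as error:
        raise ValueError("malformed token") from error
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    audiences = claims.get("aud")
    if isinstance(audiences, str):
        audiences = [audiences]
    if claims.get("iss") != issuer or audience not in (audiences or []):
        raise ValueError("wrong issuer or audience")
    if not claims.get("sub"):
        raise ValueError("missing subject")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired")
    return claims
```

A request handler would take the token from the `Authorization: Bearer ...` header, call `verify_token` with the current time, and only then apply the role-based checks.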

10. API documentation

From the user perspective, high quality documentation is an essential part of any API. To ensure that an API is useable by external parties, its documentation must be understandable, correct, up-to-date, comprehensive and complete. Furthermore, the documentation should be public and available free of charge. Only a good documentation together with good design and sound implementation will provide for a successful API.

The API should be defined “first”, starting from the use cases derived from the Domain-Driven Design and the business processes. The experience gained while implementing the API should then be used to improve the specification step by step.

10.1 Implementation basics

The documentation and the interface code should be generated from the formal specification of the API. Therefore, the specification should be defined using the schema and interface definition language (SDL) OpenAPIv3 [8]. This SDL should be used to specify RESTful APIs, GraphQL APIs, Weblinks and Streams/Events.

The specification should define server(s), security mechanisms, entry points, and all data models.

Each item of the specification must be explained comprehensively and precisely. Fields should be defined as precisely as possible, e.g. providing the “format” attribute for all fields and describing custom string values using regular expressions (regex):

{ "name": "color", "type": "string", "pattern": "^(red|blue|green)$" }
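How a regular expression in a field definition can be used to validate values is sketched below. The sketch assumes the regex is stored under a `pattern` key, as in OpenAPI schema objects; the function name is hypothetical. Such patterns are not anchored implicitly, which is why `^` and `$` are written out.

```python
import re

# Field definition with an explicit, anchored regular expression.
COLOR_FIELD = {"name": "color", "type": "string", "pattern": "^(red|blue|green)$"}


def is_valid_string(field, value):
    """Check a value against a string field definition and its optional pattern."""
    if not isinstance(value, str):
        return False
    pattern = field.get("pattern")
    return pattern is None or re.search(pattern, value) is not None
```

A precise definition like this lets generated documentation, client code and input validation all rely on the same rule.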

The description fields should be used to describe the purpose, functionality, usage and context of the API itself and all its items. The description fields should contain structured or formatted text by using the markdown syntax.

All HTTP status codes that can be returned must also be listed and explained.

10.2 Recommendations

Any documentation should include use cases, domain stories [9], sequence diagrams, code examples, and other figures to explain the usage and functionality of the API.

As APIs evolve over time, the documentation must include a history with changes for each release.

The documentation should be freely available online as a web page, as a printable e-paper (pdf), and should be linked from a well-known location. The documentation should be automatically generated and updated using a CI/CD process chain.

Lastly, documentation should be supplemented by test cases in a readable and executable form, e.g. using behaviour-driven testing [10].

This completes the API Guidance series. Don’t forget to discuss the materials by commenting on each instalment, or with your peers in our forum. Also, test your knowledge in the SCDS API Guidance eLearning modules.

(C) 2020 Support Centre for Data Sharing

For questions and comments, please visit our forum on Futurium.