
API Guidance: An overview of API technologies Part 2

Application Programming Interfaces (APIs) are an integral part of modern distributed systems and an essential link in the data supply chain for modern, data-driven organisations. They are therefore indispensable for the data sharing ecosystem that the Support Centre for Data Sharing seeks to promote. This three-part series, accompanied by eLearning modules, provides concentrated and practical knowledge for those who are thinking about developing and deploying an API and need to understand the basics of the technology, as well as for those looking for more information on what to consider during implementation.

This content will be published in two ways:

  1. Here, as episodic website-content, with a new instalment every month, and
  2. after the series’ completion, as a single document that you can download and read offline.

The guide is available in English, French and German. Each instalment is followed by eLearning modules, covering their respective topics – and helping you to test your knowledge.

In the first part of this series, we already covered the basics of APIs. This second instalment will cover:

  • API types and their practical implications
  • Application scenarios
  • API Documentation

The third instalment will discuss the implementation guidance for APIs and will focus on:

  • RESTful interfaces
  • GraphQL interfaces
  • Security design
  • API documentation

Stay tuned and don’t forget to discuss the materials by commenting on each instalment, or with your peers in our forum.  

 

4. API types and their practical implications

There are various types of APIs to help microservices and Self-Contained Systems to communicate with each other. The following section explores five of the most important API types: SOAP, RESTful interfaces, GraphQL, streams / events, and weblinks.

 

4.1 SOAP

The “Simple Object Access Protocol” (SOAP or WS-SOAP) is a standardized protocol [1] for remote procedure calls (RPCs). An RPC triggers an operation between two systems, using the HTTP(S) protocol and the XML data representation. HTTP(S) serves as the underlying transport protocol to access the entry point of a SOAP service. The message content that actually implements the RPC is contained in standardized XML data.
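As a rough illustration of RPC content carried in XML, the following sketch wraps a call in a SOAP envelope using Python's standard library. The operation `GetNumberInStock`, its parameter `bookId` and the namespace `https://my-bookstore.com/soap` are hypothetical; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SHOP_NS = "https://my-bookstore.com/soap"  # hypothetical service namespace


def build_soap_request(book_id: int) -> bytes:
    """Wrap a hypothetical GetNumberInStock call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SHOP_NS}}}GetNumberInStock")
    ET.SubElement(call, f"{{{SHOP_NS}}}bookId").text = str(book_id)
    return ET.tostring(envelope, encoding="utf-8")


# The resulting bytes would be sent as the body of an HTTP(S) POST request.
message = build_soap_request(1)
```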

SOAP is still widely used, mainly because it is compatible with the framework-specific interoperability requirements of existing applications that follow the “Service-oriented Architecture” (SOA) approach. It is also perceived as beneficial thanks to the huge SOAP-ecosystem of technical components and implementation guidelines. These include, for example, security services and mandatory technology policies that projects should consider.

XML and SOAP are still very present today, implemented in various existing or legacy interfaces and implementations. But they are also heavyweight technologies that are increasingly replaced by applications employing a RESTful or microservice approach. This API guide focusses on more modern approaches and, therefore, does not explore SOAP in greater detail.

 

4.2 RESTful interfaces

RESTful interfaces apply the Representational State Transfer (REST) architecture to enable a client (i.e. the requesting system) to access and to manipulate the textual representations of web resources through predefined, uniform and stateless operations. REST relies on hypermedia linking, i.e. the linking to other content such as videos, audio, and text via hyperlinks. Although the paradigm does not depend on a specific transport protocol, RESTful webservices frequently use the fixed operations of the HTTP(S) protocol.

In practice, a RESTful link points to a single hypermedia resource or to a set of them, e.g. books in a bookstore:

https://my-bookstore.com/store/books

https://my-bookstore.com/store/books/1

The first link returns a list of all available books, while the second refers to one specific book. As mentioned before, RESTful interfaces use URLs not only to point to entry points or resources, but also to trigger additional functions:

https://my-bookstore.com/store/books/1/numberInStock

This link would call a function that counts how many copies of a specific book are in stock. In the same manner, links can be extended with parameters, for example to filter or query for a specific author:

https://my-bookstore.com/store/books?author=Doyle
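How a server might take such links apart can be sketched with Python's standard `urllib.parse` module. The function name `describe_link` is illustrative and not part of any framework.

```python
from urllib.parse import urlsplit, parse_qs


def describe_link(url: str) -> dict:
    """Split a bookstore link into its resource path segments and filters."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # parse_qs returns a list per key; keep the first value of each filter.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"segments": segments, "filters": filters}


collection = describe_link("https://my-bookstore.com/store/books")
single = describe_link("https://my-bookstore.com/store/books/1")
filtered = describe_link("https://my-bookstore.com/store/books?author=Doyle")
```

The path segments identify the resource, while the query string carries the optional filter.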

Combined with an HTTP method, these links define the operation to be performed on a resource, e.g.:

GET https://my-bookstore.com/store/books (to retrieve a list of books)

DELETE https://my-bookstore.com/store/books/1 (to delete a specific book)

POST https://my-bookstore.com/store/books (to create a book in the list)

When a book is posted to a list, the details about the new book are given as a payload in the body of the POST request, using the JSON format:

{
    "title": "4.50 From Paddington. (Miss Marple)",
    "author": "Agatha Christie",
    "isbn": "978-0007120826"
}

In the case of the GET query, the list of books is returned as the payload in the body of the GET response. An HTTP method call is complemented by metadata, sent in the request and returned in the response. Such metadata can, for example, state the representation format that is expected or provided, e.g. JSON:

Content-Type: application/json
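A minimal sketch of how a client might assemble such a request with Python's standard library is shown here. It only builds the request object and does not send it. Posting to the collection URL is an assumption based on common REST practice.

```python
import json
from urllib.request import Request

new_book = {
    "title": "4.50 From Paddington. (Miss Marple)",
    "author": "Agatha Christie",
    "isbn": "978-0007120826",
}

# Build (but do not send) the request: method, URL, metadata and payload.
request = Request(
    url="https://my-bookstore.com/store/books",
    data=json.dumps(new_book).encode("utf-8"),
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    method="POST",
)
```

Sending it would be a call to `urllib.request.urlopen(request)`, which needs a real server behind the example URL.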

In addition to these elements, each HTTP response includes a status code. The HTTP status codes are mostly standardized, e.g.

200 OK

So, in summary, a basic REST interface covers these five main elements:

  1. URL schema, including parameters
  2. HTTP methods
  3. HTTP metadata
  4. Payload defined by a representation schema
  5. Status codes
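The five elements can be seen working together in a toy, in-memory request handler. This is a sketch under simplifying assumptions (no network, no query strings, sample data only), not a production server, and the function name `handle` is illustrative.

```python
import json

# Sample data for illustration only.
BOOKS = {1: {"title": "4.50 From Paddington. (Miss Marple)", "author": "Agatha Christie"}}
JSON_META = {"Content-Type": "application/json"}


def handle(method, path, headers, body=""):
    """Return (status code, response metadata, payload) for one request."""
    segments = [s for s in path.split("/") if s]  # 1. URL schema
    if segments[:2] != ["store", "books"] or len(segments) > 3:
        return 404, JSON_META, json.dumps({"error": "not found"})
    book_id = None
    if len(segments) == 3:
        if not segments[2].isdigit() or int(segments[2]) not in BOOKS:
            return 404, JSON_META, json.dumps({"error": "no such book"})
        book_id = int(segments[2])

    if method == "GET":  # 2. HTTP method
        data = list(BOOKS.values()) if book_id is None else BOOKS[book_id]
        return 200, JSON_META, json.dumps(data)  # 5. status code
    if method == "POST" and book_id is None:
        if headers.get("Content-Type") != "application/json":  # 3. HTTP metadata
            return 415, JSON_META, json.dumps({"error": "JSON expected"})
        new_id = max(BOOKS, default=0) + 1
        BOOKS[new_id] = json.loads(body)  # 4. payload
        return 201, JSON_META, json.dumps({"id": new_id})
    if method == "DELETE" and book_id is not None:
        del BOOKS[book_id]
        return 204, {}, ""
    return 405, JSON_META, json.dumps({"error": "method not allowed"})


status, metadata, payload = handle("GET", "/store/books", {})
```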

Further aspects and advice on how to build a good REST interface can be found in numerous guides, e.g. the “Hypermedia As The Engine Of Application State” (HATEOAS) [2] rules. A complete interface specification of a service should be defined using a formal description such as the OpenAPI 3 standard proposal [3]. The formal description and documentation of APIs are explained in section 6.

 

4.3 GraphQL

While WS-SOAP and WS-REST are generic approaches for exchanging data and communicating in distributed systems, GraphQL can help to retrieve and modify data in a more flexible and, depending on the use case, more efficient way. Solutions based on WS-SOAP and WS-REST can be inefficient in some cases, most commonly through long round trips and through under- and overfetching of data. A long round trip is a situation where it takes a relatively long time to complete the cycle from query to response, i.e. from client to server and back to the client. Under- and overfetching describe situations where an API provides too little or too much data; both can occur at the same time.

For example, imagine an API user who wants to show a list of book titles with their authors on a website, but books and authors are stored in different resources and are accessible via separate API endpoints (“https:…/store/books” and “https:…/store/authors”). The user would first request the list of books from the respective endpoint, but receive too much information, such as ISBNs (overfetching). The user would then need to call the other endpoint just to request the full author names (underfetching). [4]

To work around these problems, API providers can optimize their APIs by adding methods to the interface that suit specific use cases. But usually there are simply too many use cases, and they can change rapidly with new or changing business cases. Hence, the workaround approach is neither flexible enough nor cost-effective. Against this background, Facebook developed GraphQL, which allows user interfaces in particular to be adapted quickly and flexibly without changing the API too often.
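The book list example can be simulated in a few lines. In this sketch, two plain functions stand in for the two REST endpoints. The data, the field names and the per-author path `/store/authors/<id>` are invented for illustration.

```python
# Stand-ins for the two REST endpoints; all data here is sample data.
AUTHORS = {10: {"id": 10, "name": "Agatha Christie", "address": "unknown"}}
BOOKS = [{"title": "4.50 From Paddington. (Miss Marple)",
          "isbn": "978-0007120826", "author_id": 10}]

calls = []  # records every simulated round trip


def get_books():
    calls.append("/store/books")
    return BOOKS


def get_author(author_id):
    calls.append(f"/store/authors/{author_id}")
    return AUTHORS[author_id]


def titles_with_authors():
    """Build the list the website needs: title plus author name."""
    rows = []
    for book in get_books():  # overfetching: the ISBN arrives but is never used
        author = get_author(book["author_id"])  # underfetching: extra call per book
        rows.append({"title": book["title"], "author": author["name"]})
    return rows


rows = titles_with_authors()
```

With one book the client already needs two round trips, and the number of author calls grows with the length of the list.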

GraphQL uses HTTP(S) to access an entry point. The GraphQL entry point allows clients to retrieve the published data using a simple query language. First, providers of a GraphQL API need to publish a global data schema, including all entities and meta information. This could look as follows:

type Author {
    name: String
    address: String
}

type Book {
    title: String
    isbn: String
    authors: [Author]
}

type Query {
    books: [Book]
    book(title: String): Book
}

A GraphQL service with this schema will return a list of all books and their authors for the query:

{ books { title, authors { name } } }

It will also show the details of a specific book by using the following query:

{ book(title: "4.50 From Paddington. (Miss Marple)") { title, isbn, authors { name } } }
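GraphQL queries are commonly sent to the entry point as JSON in an HTTP POST body, with the query text in a `query` field and any arguments in `variables`. The sketch below builds such a request without sending it. The endpoint URL `https://my-bookstore.com/graphql` and the operation name `BookByTitle` are assumptions.

```python
import json
from urllib.request import Request

# Same query as in the text, with the title passed as a variable.
QUERY = """
query BookByTitle($title: String) {
  book(title: $title) { title isbn authors { name } }
}
"""


def build_graphql_request(endpoint: str, title: str) -> Request:
    """Build (but do not send) a GraphQL-over-HTTP POST request."""
    payload = {"query": QUERY, "variables": {"title": title}}
    return Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(
    "https://my-bookstore.com/graphql",  # hypothetical entry point
    "4.50 From Paddington. (Miss Marple)",
)
```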

The GraphQL schema defines the maximum view of the data that a client can access through a GraphQL entry point. Not all fields in the schema need to be real data fields in the underlying data storage. For example, a field such as “numberOfBooks” could be implemented internally by the server as a function. Because the API exposes all available data, a client can be adapted quickly and flexibly to the data view it really needs for a use case. Changes to the API are only required if the data schema changes. Even then, a client only needs to change if it uses the altered fields.
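The idea that a field can be backed by a function, and that clients receive only what they select, can be sketched with a toy resolver table. This is not a real GraphQL engine: the `resolve` function and the dictionary-based selection format are simplifications invented for this example.

```python
# Sample data for illustration only.
BOOKS = [
    {"title": "4.50 From Paddington. (Miss Marple)", "isbn": "978-0007120826"},
]

# Each schema field maps to a resolver; "numberOfBooks" has no stored column.
RESOLVERS = {
    "books": lambda: BOOKS,
    "numberOfBooks": lambda: len(BOOKS),
}


def resolve(selection: dict) -> dict:
    """Resolve a selection of the form {field: [subfields] or None}."""
    result = {}
    for field, subfields in selection.items():
        value = RESOLVERS[field]()
        if subfields:  # project each list item onto the requested subfields
            value = [{name: item[name] for name in subfields} for item in value]
        result[field] = value
    return result


# The client asks for titles and the computed count, but not for ISBNs.
answer = resolve({"books": ["title"], "numberOfBooks": None})
```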

A GraphQL entry point for a certain service is implemented as a GraphQL Server. The GraphQL Server knows the given data schema and evaluates GraphQL queries. It usually does not manage the underlying data itself. Instead, it acts more like a proxy, mapping the queries efficiently to an existing database management system or other source systems. Besides querying data, GraphQL also allows clients to modify data and to subscribe to streams of messages, so that data users are informed about events immediately.

Usually there is only one entry point for GraphQL. But for reasons like data protection, a service could have entry points for different roles. To manage access, these entry points would restrict access to certain schema entities for different users, thus limiting their ability to retrieve data depending on their role.

To ease the development of clients, GraphQL provides a standard implementation of a web-based playground, which can easily be integrated into a GraphQL Server. [5] The playground lets developers test GraphQL queries interactively against the provided data.

 

4.4 Streams / Events

Simply processing tasks in a sequential and transactional way does not enable modern applications to be reactive. Reactive means that an application is responsive, elastic, resilient and message-driven. According to the Reactive Manifesto, reactive applications work better in modern distributed systems, which span a wide variety of machines from mobile phones to large-scale servers and are driven by demanding user expectations of millisecond response times and 100% uptime [6].

Figure 2: Asynchronous messaging using an Event Bus and queued Streams


New architectural designs, such as event-driven systems, aim to respond to these requirements. In addition, paradigms such as “Command Query Responsibility Segregation” (CQRS) enable distributed systems to communicate via asynchronous messages. Instead of halting all further processing until it has received a response from the server, a client using asynchronous messages simply announces an event to all interested listeners that are currently available. Such messages can be queued in a stream, guaranteeing that one or all subscribers read each message, each in its own time. Crucially, this enables a far more effective distribution of tasks to multiple workers. Like the payloads of the previously described APIs, events and streams of messages transport information. Besides using a common service for all parties with a well-defined API for the underlying transport protocol, it is therefore important that the format of the messages and the subjects/topics of the specific channels are well defined and documented.
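The queueing behaviour described above can be sketched with a small in-memory event bus. It is a single-process illustration only: the class `EventBus`, the topic name `book.created` and the subscriber names are hypothetical, and a real system would use a dedicated message broker.

```python
from collections import defaultdict, deque


class EventBus:
    """In-memory publish/subscribe bus with one queue per subscriber."""

    def __init__(self):
        self._queues = defaultdict(dict)  # topic -> {subscriber name: deque}

    def subscribe(self, topic, subscriber):
        self._queues[topic][subscriber] = deque()

    def publish(self, topic, message):
        # The publisher returns immediately; it never waits for a reader.
        for queue in self._queues[topic].values():
            queue.append(message)

    def poll(self, topic, subscriber):
        """Return the subscriber's next message, or None if it has caught up."""
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


bus = EventBus()
bus.subscribe("book.created", "search-index")
bus.subscribe("book.created", "newsletter")
bus.publish("book.created", {"isbn": "978-0007120826"})
```

Each subscriber reads from its own queue, so a slow consumer does not hold up the publisher or the other subscribers.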

 

4.5 Weblinks

While differing back-end technologies are often well structured into microservices and containers, the division of labour for the front-end can be just as challenging. Like the back-end, front-end frameworks, even recent ones, can give rise to new monoliths. The technical solution is to break up the GUI into reusable, polyglot components, e.g. using web components and building Micro Frontends [7]. Looking back at figure 1, and independent of any particular implementation, it seems common sense to split the GUI into parts, especially when implementing Verticalized Systems such as the outlined Self-Contained Systems. In practice, however, this view is only just gaining a foothold.

To integrate polyglot GUI systems, it is reasonable to use weblinks to switch between GUI parts (e.g. Micro Frontends) inside the overall GUI frontend. These weblinks are not only pointers to an entry point. They also carry the context information needed to activate it, for example to edit the information about a specific item with a given identifier. Whatever GUI framework is used, it should not be difficult to map each public weblink to the specific component code using the internal HTTP router of the applicable GUI code.

Therefore, these links can be seen as their own kind of API. As with back-end APIs, the weblink-based API for a GUI component must include all entry points (HTTP addresses) and all parameters that can be contained in the links.
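A weblink-based API of this kind can be sketched as a routing table from public paths to GUI components. The paths, component names and the `bookId` parameter below are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical public weblinks of two Micro Frontends.
ROUTES = {
    "/catalogue/edit": "catalogue-editor",
    "/orders/view": "order-viewer",
}


def resolve_weblink(url):
    """Map a public weblink to a GUI component and its context parameters."""
    parts = urlsplit(url)
    component = ROUTES.get(parts.path)
    if component is None:
        raise LookupError(f"no GUI component registered for {parts.path}")
    context = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return component, context


target = resolve_weblink("https://my-bookstore.com/catalogue/edit?bookId=1")
```

Documenting the routing table and the accepted parameters is what turns these links into a published API.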

eLearning module for Chapter 4

 

5. Application scenarios

Of course, the choice of API type should be guided by the requirements of a specific use case, but in practice that is easier said than done. Usually, more than one API technology can be used, and identifying the best solution is often difficult. If a service provides a GUI to be used by third parties, the provider should describe the weblinks of these GUI components in detail in an API.

WS-SOAP and WS-REST are both technologies for exchanging data and making RPCs in distributed systems. Because changes to these API types are critical and trigger complicated adaptations throughout the respective ecosystems, these technologies should be used for “stable” business use cases whose business logic is fixed over a long period of time, e.g. regular bulk data transfers. As WS-REST APIs are more lightweight, they should be used for new APIs, whenever a system is extended, and for internal communication between systems. Usage of WS-SOAP should be reserved for legacy APIs between independent organizations; even so, WS-SOAP will probably continue to exist for some time, especially where it is used in standards.

If data must be provided in a flexible, agile way, specifically for GUI components, GraphQL should be the first choice. All asynchronous communication to implement Event-Driven Architectures, Event Sourcing, and CQRS should use events and streams, especially for internal use cases. For public APIs, providers should decide carefully whether the publish/subscribe mechanism of GraphQL fulfils their use cases or whether the more advanced events and streams are more appropriate.

eLearning module for Chapter 5

 

6. Documentation

APIs are the connecting link between a service and its users. The best service implementation, using the most advanced technologies and creative solutions, will ultimately fail if the documentation of its APIs is poor or careless. A user, i.e. a third-party programmer, must be able to understand the API well in order to use it successfully. Good, comprehensive and understandable documentation is therefore an essential part of every API, just as important as the robust and solid implementation of the service itself.

For this, documentation of the source code, or documentation generated from comments in the source code, is not sufficient. As public APIs connect distributed services written by several parties, the programming languages and tools they use can, and likely will, differ. Hence, the API must support a polyglot environment and should therefore be specified in a neutral, formalized form with unambiguous semantics. This can be achieved by using schema definition languages like OpenAPI. [8] This interface description language is most suitable for REST APIs, but can also be used for weblinks, events/streams and GraphQL. An OpenAPI-based interface description can also be used to generate user-friendly documentation as well as code for server and client implementations, which helps to keep the implementation and the documentation of the API consistent.
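To give an impression of what such a neutral description looks like, here is a minimal OpenAPI 3 document for the bookstore's list endpoint, written as a Python dictionary and serialized to JSON. The API title, version number and summary texts are invented for illustration; a real description would cover every path and response.

```python
import json

openapi_document = {
    "openapi": "3.0.3",
    "info": {"title": "Bookstore API (illustrative)", "version": "1.0.0"},
    "paths": {
        "/store/books": {
            "get": {
                "summary": "List books, optionally filtered by author",
                "parameters": [{
                    "name": "author",
                    "in": "query",
                    "required": False,
                    "schema": {"type": "string"},
                }],
                "responses": {
                    "200": {
                        "description": "A list of books",
                        "content": {"application/json": {"schema": {
                            "type": "array",
                            "items": {"$ref": "#/components/schemas/Book"},
                        }}},
                    }
                },
            }
        }
    },
    "components": {"schemas": {"Book": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "author": {"type": "string"},
            "isbn": {"type": "string"},
        },
    }}},
}

# The JSON text is what documentation and code generators would consume.
rendered = json.dumps(openapi_document, indent=2)
```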

Beyond the syntactical, formal description of the API, there should be an explanation of its behaviour. This can be achieved by providing standardized models or diagrams, e.g. using the “Business Process Model and Notation” [9] (BPMN) or sequence diagrams of the “Unified Modeling Language” [10] (UML). A further important description of the behaviour is a comprehensive set of test cases, e.g. using a behaviour-driven development approach for API testing. [11]
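Test cases that document behaviour are often written in a given/when/then style. The following sketch uses Python's standard `unittest` module against a minimal in-memory stand-in. The `Bookstore` class and its methods are hypothetical, and a real suite would call the deployed API instead.

```python
import unittest


class Bookstore:
    """Minimal stand-in for the service under test (illustrative only)."""

    def __init__(self):
        self._stock = {}

    def add_book(self, isbn, copies):
        self._stock[isbn] = self._stock.get(isbn, 0) + copies

    def number_in_stock(self, isbn):
        return self._stock.get(isbn, 0)


class NumberInStockBehaviour(unittest.TestCase):
    def test_stock_reflects_added_copies(self):
        # Given a bookstore holding three copies of a title
        store = Bookstore()
        store.add_book("978-0007120826", copies=3)
        # When a client asks for the number in stock
        count = store.number_in_stock("978-0007120826")
        # Then the API reports exactly those three copies
        self.assertEqual(count, 3)

    def test_unknown_book_has_no_stock(self):
        # Given an empty bookstore
        store = Bookstore()
        # When a client asks about a title that was never added
        count = store.number_in_stock("000-0000000000")
        # Then the API reports zero copies instead of failing
        self.assertEqual(count, 0)
```

Read top to bottom, each test states one expected behaviour of the API in terms a third-party programmer can follow.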

 

eLearning module for Chapter 6

 

 

Test your knowledge in the SCDS API Guidance eLearning modules: https://elearningcourses.eudatasharing.eu/en/apiguidance/#/

Image credit:
(C) 2020 Support Centre for Data Sharing