1. Introduction
This chapter introduces Links and Nodes in general and describes the specific problems it addresses.
It also outlines the central components of Links and Nodes, explains important terms, and gives an overview of what the following chapters contain and where to continue reading, depending on your prior knowledge and need for detail.
1.1. What is “Links and Nodes”?
Links and Nodes (which we usually abbreviate as LN) is a middleware to create and manage flexible distributed real-time systems. It was created to develop and control embedded robotic systems.
So, what challenges does Links and Nodes address? Why should one go to the trouble of using it?
1.1.1. What makes it difficult to set up distributed robotic systems?
Setting up experimental robotic systems involves a number of challenges which Links and Nodes addresses. The most important ones are:
Such systems consist of many different parts, often written by different people in different programming languages. Typically, the parts are divided into processes. Processes are isolated from each other: each one has its own memory address space and cannot access the memory of other processes. This so-called memory protection is intentional, because it makes processes easier to develop and debug. Nevertheless, the processes somehow need to communicate.
It is possible to write low-level communication routines using OS facilities such as System V shared memory or pthreads. But programming with these is difficult and error-prone. For example, it is easy to create race conditions, which are often very difficult to debug because they lead to non-deterministic errors and are, depending on the programming language, a form of undefined behavior. And time spent on debugging adds little value to what the robotic system should achieve or show.
If one uses low-level OS facilities for communication, setting up such a system requires a lot of repetitive code which is hard to keep track of. These facilities are also not well supported in many programming languages, for example those which specialize in numerics or in ease of use.
As configurations also quickly become quite complex, it becomes difficult to understand how a system is configured in detail.
When many people work together, everyone uses their individual user environment configuration. Usually, there are multiple ways of altering a configuration and multiple places where changes can be stored. With the number of shared and re-used components, the number of participants also grows quickly. While convenient for single users, this can make it very difficult to provide a clearly defined working environment. It can also make it difficult to ensure that the system behaves in a deterministic and reproducible way: what works today might not work tomorrow, or for a different user.
Hardware and how it is connected via networks changes according to tasks and over time. This means that manual configurations need to be adapted to which nodes run in a network, how they are connected, and so on.
Processes which have real-time requirements need to have priorities assigned and managed.
It is also often necessary that one process is up and running before other processes which depend on it can start working. We say that the second process depends on the first one.
1.1.2. Features and design principles of LN
The following are some features of Links and Nodes which address these challenges and make it suited to robotic applications:
1.1.2.1. Main features and principles
Links and Nodes
displays and controls process state via a GUI or a command line interface.
uses a single configuration source to organize a system, which is in text form and can be managed by version control.
supports real-time operation. For example, it can transmit messages between processes on the same computer nearly instantly (within microseconds).
requires no knowledge of thread programming.
supports different programming languages and environments. Currently, these include C, C++, Python2, Python3, and Matlab/Simulink.
isolates processes from the particular user environment, so that different personal configurations are less likely to break a setup.
aims to provide a fully reproducible setup.
makes process management and communication network-transparent, which means that it makes no logical difference where some of the processes run.
How all of these principles are implemented and how these features are used goes beyond the scope of this introduction. They will be discussed later in the User Guide part.
1.1.2.2. Overview of the functions provided by Links and Nodes
LN provides two main functions:
Process management: This allows you to control processes and their inter-dependencies, for example starting and stopping them in a defined order, handling errors, looking at their output, assigning scheduling priorities, sending signals, and so on. The result resembles an init system for a robotic system, similar to what systemd does on a general Linux system, but much more flexible, because it can be controlled interactively.
There are several central aspects which Links and Nodes takes care of:
- Process dependencies: For example, it allows you to start processes which depend on each other. One can then modify the source code of one of these processes, re-compile and restart it, and other processes which depend on it will be restarted as needed. In this way, it supports flexible development of robotic processes.
- Priorities: In embedded real-time systems, it matters which process proceeds if there are several tasks which one CPU could run at the same time. For example, one process might contain a control loop which monitors basic safety parameters, and another one might check the battery state of a robot and initiate actions if the battery runs low. Priorities define which processes are more important, in the sense that the CPU attends to them first.
- Nodes: A distributed system often has several nodes which cooperate to do some work. They could be different computers with different CPUs and operating systems. The process management knows which node should run which process, and how to access and start them, so that all parts work together like a single computer.
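The dependency handling described above can be illustrated with a small sketch. It is not LN code and does not use the LN configuration format; the process names and both helper functions are hypothetical, and it assumes Python 3.9 or newer for the standard `graphlib` module. It shows how a start order and the set of processes affected by a restart could be derived from a dependency table.

```python
from graphlib import TopologicalSorter

# Hypothetical process names. Each key depends on the processes in its set.
DEPENDS_ON = {
    "camera_driver": set(),
    "state_estimator": {"camera_driver"},
    "controller": {"state_estimator"},
    "logger": {"camera_driver"},
}


def start_order(depends_on):
    """Return an order in which every process starts after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def affected_by_restart(depends_on, restarted):
    """Return all processes that directly or indirectly depend on `restarted`."""
    affected = set()
    frontier = [restarted]
    while frontier:
        current = frontier.pop()
        for name, dependencies in depends_on.items():
            if current in dependencies and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


print(start_order(DEPENDS_ON))
print(sorted(affected_by_restart(DEPENDS_ON, "camera_driver")))
```

Several start orders can be valid for the same table, so the sketch only guarantees that a process never appears before something it depends on.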
Communication: LN provides an easy way for these processes to work together by sending each other defined pieces of data. For this, it primarily uses a message-passing approach: One process prepares such a piece of data, and when it has finished that preparation, it calls an LN library routine which transmits that data to the recipient process (or a number of recipient processes). The recipient processes can use another library routine to read this data once it has arrived. This is a widely used, quite easy and safe paradigm: Programmers who use the library routines do not need to take care of multi-threading, locking, deadlocks, race conditions, and many other things which often make multi-threaded programs very difficult to write and debug.
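As a rough illustration of the message-passing idea, the following sketch uses Python's standard `multiprocessing.Pipe` as a stand-in channel. It does not use the LN client library; the function names `send_state` and `receive_state` and the message fields are assumptions made for this example. For brevity, both ends of the channel live in one process here, whereas in a real system they would belong to different processes.

```python
from multiprocessing import Pipe


def send_state(connection, sequence, angles):
    """Sender side: hand a finished piece of data to the channel."""
    connection.send({"sequence": sequence, "angles": angles})


def receive_state(connection, timeout=1.0):
    """Receiver side: return the next message, or None if none arrives in time."""
    if connection.poll(timeout):
        return connection.recv()
    return None


# One-directional channel: the first end can only receive, the second only send.
receiver_end, sender_end = Pipe(duplex=False)

send_state(sender_end, 1, [0.0, 0.5, -0.25])
message = receive_state(receiver_end)                   # the message sent above
nothing_yet = receive_state(receiver_end, timeout=0.0)  # no further message
print(message, nothing_yet)
```

Neither side touches the other's memory or takes a lock; each one only calls a send or a receive routine, which is the property the paragraph above describes.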
Communication patterns supported by Links and Nodes are of several different types. The two most important ones are:
Topics: Topics implement a publisher/subscriber pattern. This means that a program can make a piece of data available which is regularly updated (usually as a step in a cyclic process). The data is attached to a name, which other programs can use to access that data. Each such name can be written to by only one program, but many other programs can read from it, so it can be used for broadcasting data. The data transmission is highly efficient and suitable for real-time applications. The transmission can be either reliable (which means any failures are corrected), or unreliable (which might have a speed advantage).
For topics, the data flow is unidirectional: data always goes from the sender to the receiver.
Services: Services implement a more generalized form of a function call. With a normal function call, some code is called within a program, is passed some arguments, and returns a return value. Usually, the code of the function runs on the same node or computer as the calling code. Services, however, implement a form of remote procedure call, which means that the call can run code on another node in the system, by transmitting the function arguments, running the computation, and transmitting the result back. This can also be very useful if the code runs on the same node, but in another process, because the calling process and the callee are separated and the programmer does not need to bother with threading issues.
(This is not an exhaustive list. LN provides additional services, which are explained later in the chapter User Guide.)
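The two patterns can be contrasted in a toy, single-process sketch. None of this is the LN API: the classes `Topic` and `ServiceRegistry`, their methods, and all names passed to them are hypothetical, and JSON merely stands in for whatever encoding a real transport would use.

```python
import json


class Topic:
    """Named latest-value channel: one writer, any number of readers."""

    def __init__(self, name, publisher):
        self.name = name
        self.publisher = publisher
        self._latest = None

    def publish(self, publisher, data):
        # Only the program that owns the topic may write to it.
        if publisher != self.publisher:
            raise PermissionError(f"{publisher!r} may not write to {self.name!r}")
        self._latest = data

    def read(self):
        # Data flows one way: readers only ever see what the publisher wrote.
        return self._latest


class ServiceRegistry:
    """Request/response calls whose arguments and results travel as data."""

    def __init__(self):
        self._handlers = {}

    def provide(self, name, handler):
        self._handlers[name] = handler

    def call(self, name, **arguments):
        request = json.dumps(arguments)                    # encode the arguments
        result = self._handlers[name](**json.loads(request))
        return json.loads(json.dumps(result))              # encode the result


battery = Topic("robot.battery", publisher="battery_monitor")
battery.publish("battery_monitor", {"percent": 80})
battery.publish("battery_monitor", {"percent": 79})
latest = battery.read()

services = ServiceRegistry()
services.provide("add_offset", lambda values, offset: [v + offset for v in values])
shifted = services.call("add_offset", values=[1, 2, 3], offset=10)
print(latest, shifted)
```

The topic only ever hands out the most recent value and never returns anything to the publisher, while the service call blocks until a result comes back.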
1.1.2.3. Main components of Links and Nodes Applications
Distributed systems which use Links and Nodes consist of three components and a fourth one which usually works quietly in the background:
The LN Manager: The LN Manager is the central program which manages processes, knows how to start them, their dependencies, whether they are running, and so on.
It is configured by a central configuration file in text form. Each distributed system runs one LN Manager which is started for a specific project or system (but several LN Managers for different systems can run on the same hosts).
Normally, the LN Manager presents itself as a GUI program which displays processes and allows you to control them. It also has a powerful command line interface.
LN Clients: LN Clients are processes which use the LN Client library to communicate and exchange messages.
Typically, they are managed by the LN Manager (however, the LN Manager can manage any kind of process, not just LN clients). In order to communicate, they use the client library API for the language in which the client is written.
Message Definitions:
The LN clients which communicate need to have a common understanding about the data which they exchange, even across different hosts, architectures, and programming languages.
To achieve that, messages are defined as plain text files, and they are identified when a specific client is started and wants to communicate with other clients. Within each client programming language, the message definitions are represented as simple packets of structured data, for example as a struct in C and C++, or as members of a class instance in Python. The translation from elements of a message definition to such data elements happens automatically and in an efficient way.
LN daemon: The LN daemon is a program which runs in the background. Its purpose is to serve as a hub for exchanging messages and setting up connections and communication.
Normally, it is automatically started on each host on which LN clients are used, and the LN Manager takes care of that. Unless there are errors or difficulties with a setup, you would not need to worry about this.
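To make the idea of message definitions more concrete, here is a minimal sketch of how a plain-text definition could be turned into a fixed binary layout and a matching record type. The definition syntax, the type names, and the function `compile_definition` are invented for this illustration and are not the actual LN message definition format or API.

```python
import struct
from collections import namedtuple

# Hypothetical definition syntax, invented for this sketch:
# one "<type> <field name>" pair per line.
DEFINITION = """\
uint32 sequence
float64 position
float64 velocity
"""

# Mapping from the hypothetical type names to struct format codes.
TYPE_CODES = {"uint32": "I", "int32": "i", "float32": "f", "float64": "d"}


def compile_definition(name, text):
    """Build a record type and a binary layout from a text definition."""
    fields, codes = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        type_name, field_name = line.split()
        codes.append(TYPE_CODES[type_name])
        fields.append(field_name)
    # "<" fixes little-endian byte order and standard sizes without padding,
    # so every host and language can agree on the same bytes.
    layout = struct.Struct("<" + "".join(codes))
    return namedtuple(name, fields), layout


JointState, layout = compile_definition("JointState", DEFINITION)
packet = layout.pack(*JointState(sequence=7, position=1.5, velocity=-0.25))
decoded = JointState(*layout.unpack(packet))
print(layout.size, decoded)
```

Because the layout is derived from the text definition alone, any client that reads the same definition can decode the packet, regardless of which language produced it.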
1.2. Quick Overview of the documentation, and where to find what
This section gives an overview of the documentation.
For readers who just want to jump into it, there is a Quickstart section which just gives an essential example with as little code as possible, and (depending on the background you have) should be sufficient to understand main concepts, run the examples, and experiment with them. These examples use Python and C++.
As a somewhat deeper introduction, we provide Tutorials for Python and C++. Where convenient, they refer to the quickstart example. However, they take care not to require prior knowledge, and explain step by step how the code fits together, how the main components work, and which points are important to observe. They also explain the central concepts behind the presented functions, and point to where further documentation on these basic functions can be found.
The main part is the (not yet finished) User Guide, which consists of two sub-parts. The first explains the important concepts in more detail. This includes the basic architecture of the system, the components, and how they relate to each other. The second explains how to use the APIs of the LN client library and its various language bindings, which support Python, C++, and C. It also has an extended section on debugging LN clients, which might be useful for programmers not familiar with the techniques.
The User Guide is then followed by the Reference part, which is a programming reference which describes each API function and component in detail. Because the other language bindings are based on C, the C language bindings provide the most detail; they are linked from each Python and C++ API element.
The reference also explains the important aspects of the GUI of the LN Manager program.
The reference part is followed by a Glossary, which explains many technical terms you might stumble upon.
The final part is an Appendix, which contains some detailed further reference information, such as the syntax of the configuration file.
1.3. If you find Bugs in the Documentation
Documentation can be seen as an important aspect of software. Certainly, errors and missing parts in the documentation can reduce the usability of any complex software, and we want to make this documentation as good as possible. So, when you find any deficiency, please open an issue and report it as a bug: just tag it with “documentation” and describe what is missing!