==============
 Introduction
==============

.. contents::

This chapter introduces what Links and Nodes is in general and which
specific problems it addresses.

.. index::
   pair: Links and Nodes; overview on central components

It also outlines the central components of Links and Nodes, explains
important terms, and finally gives an overview of which information can
be found in the following chapters and where to continue reading,
depending on your prior knowledge and need for detail.

What is "Links and Nodes"?
==========================

.. index::
   single: Links and Nodes; what it is

Links and Nodes (which we usually abbreviate as LN) is a
:term:`middleware` to create and manage flexible distributed
:term:`real-time` systems. It was created to develop and control
:term:`embedded` robotic systems.

So, what challenges does Links and Nodes address? Why should one go to
the trouble of using it?

.. _introduction/rationale:

What makes it difficult to set up distributed robotic systems?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: Links and Nodes; what problems does it solve?
   single: processes; introduction
   single: memory protection; introduction
   single: threads programming
   single: concurrency bugs
   single: isolation from user environment
   single: user environment; why it can cause trouble
   pair: introduction; dependency management
   pair: introduction; process management

There are a number of challenges in setting up experimental robotic
systems which Links and Nodes addresses. The most important ones are:

* Such systems consist of many different parts, often written by
  different people in different programming languages. Typically, the
  parts are divided into :term:`processes`. Processes are isolated from
  each other, in that each one has its own memory address space and
  cannot access memory from other processes.
  This so-called :term:`memory protection` is intentional: processes
  are easier to develop and debug this way. Nevertheless, the processes
  need to communicate somehow.

* It is possible to write low-level communication routines using OS
  facilities such as System V :term:`shared memory` communication or
  :term:`pthreads`. But programming these is difficult and error-prone
  unless one is very careful. For example, it is easy to create
  :term:`race conditions`, which are often very difficult to debug
  because they lead to :term:`non-deterministic` errors and are,
  depending on the programming language, a form of
  :term:`undefined behavior`. And time spent on debugging adds little
  value to what the robotic system should achieve or show.

* If one uses low-level OS facilities for communication, setting up
  such a system requires a lot of repetitive code which is hard to keep
  track of. These facilities are also not well supported in many
  programming languages, for example those which specialize in numerics
  or in ease of use.

* As configurations quickly become quite complex, it becomes difficult
  to understand how a system is configured in detail.

* When many people work together, everyone will use their individual
  :term:`user environment` configuration. Usually, there are multiple
  ways of altering a configuration and multiple places where changes
  can be stored. With the number of shared and re-used components, the
  number of participants also grows quickly. While convenient for
  single users, this can make it very difficult to provide a clearly
  defined working environment. It can also make it difficult to ensure
  that the system behaves in a
  :term:`deterministic and reproducible way`: what works today might
  not work tomorrow, or for a different user.

* Hardware, and how it is connected via :term:`networks`, changes
  according to tasks and over time. This means that manual
  configurations need to be adapted to which nodes run in a network,
  how they are connected, and so on.
* Processes which have :term:`real-time` requirements need to have
  :term:`priorities` assigned and managed.

* It is also often necessary that one process is up and running before
  other processes which depend on it can start working. We say that
  the second process :term:`depends` on the first one.

Features and design principles of LN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: Links and Nodes; features and main principles
   pair: LN manager; configuration file
   pair: introduction; network transparency
   single: reproducibility

The following are some features of Links and Nodes which address these
challenges and make it suited to robotic applications:

Main features and principles
++++++++++++++++++++++++++++

Links and Nodes

* displays and controls process state via a :term:`GUI` or a command
  line interface.

* uses a single :term:`configuration source` to organize a system,
  which is in text form and can be managed by version control.

* supports :term:`real-time` operation. For example, it can transmit
  :term:`messages` between processes on the same computer nearly
  instantly (within microseconds).

* requires no knowledge of :term:`threads programming` to use it.

* supports different programming languages and environments.
  Currently, they include C, C++, Python2, Python3, and
  Matlab/Simulink.

* isolates processes from the particular :term:`user environment`, so
  that different personal configurations are less likely to break a
  setup.

* aims to provide a fully :term:`reproducible` setup.

* makes :term:`process management` and communication
  :term:`network-transparent`, which means that it makes no logical
  difference where some of the processes run.

.. Commenting out these because it gets too long
..
.. Further features which help
.. ---------------------------
.. * it makes it easy to set-up and control demonstrations of systems
.. * it only needs TCP connections in order to work on different
..   hosts.
.. * it presents and manages different systems and architectures in
..   a homogeneous ways, abstracting away differences in handling.
.. * It uses name spaces and groups to group processes and messages,
..   which makes it easier to compose complex systems out of simpler
..   components, and to manage these components.
.. * it avoids redundancy and boiler-plate code
.. * it allows to define templates for process configuration, to help
..   avoid repetitive configurations
.. * It abstracts nodes and network properties in the configuration,
..   which makes it easier to run a system on different hardware or at
..   another site
..

How all of these principles are implemented and how these features are
used goes beyond the scope of this introduction. They are discussed
later in the :doc:`user_guide` part.

.. _overview/functions:

Overview of the functions provided by Links and Nodes
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index::
   pair: Links and Nodes; main functions
   single: process management in Links and Nodes
   single: process dependencies; introduction
   single: priority management; introduction
   single: nodes of a distributed system; introduction

LN provides two main functions:

* **Process management**: This allows controlling processes and their
  interdependencies, for example starting and stopping them in a
  defined order, handling errors, looking at their output, assigning
  scheduling priorities, sending :term:`signals`, and so on. The result
  resembles an init system for a robotic system, similar to what
  systemd does on a general Unix system, but much more flexible
  because it can be controlled interactively. There are several central
  aspects which Links and Nodes takes care of:

  * **Process dependencies**: For example, it allows starting processes
    which depend on each other. One can then modify the source code of
    one of these processes, recompile and restart it, and other
    processes which depend on it will be restarted as needed.
    In this way, it supports flexible development of robotic
    processes.

  * **Priorities**: In embedded real-time systems, it matters which
    process proceeds when several tasks are ready to run on the same
    CPU at the same time. For example, one process might contain a
    control loop which monitors basic safety parameters, and another
    one might check the battery state of a robot and initiate actions
    if the battery runs low. Priorities define which processes are more
    important, in the sense that the CPU attends to them sooner.

  * **Nodes**: A distributed system often has several nodes which
    cooperate to do some work. They could be different computers with
    different CPUs and operating systems. The process management knows
    which system should run which process, and how to access and start
    them, so that all parts work together like a single computer.

.. _intro/communication:

.. index::
   single: Communication in Links and Nodes
   single: topics; introduction
   single: services; introduction

* **Communication**: LN provides an easy way for these processes to
  work together by sending each other defined pieces of data. For
  this, it primarily uses a message-passing approach: one process
  prepares such a piece of data, and when it has finished that
  preparation, it calls an LN library routine which transmits the data
  to the recipient process (or a number of recipient processes). The
  recipient processes can use another library routine to read this data
  once it has arrived. This is a widely used, quite easy and safe
  paradigm: programmers who use the library routines do not need to
  take care of :term:`multi-threading`, locking, :term:`deadlocks`,
  :term:`race conditions`, and many more things which often make
  multi-threaded programs very difficult to write and debug.

  Links and Nodes supports several types of communication patterns. The
  two most important ones are:

  * **Topics**: Topics implement a publisher/subscriber pattern.
    This means that a program can make a piece of data available which
    is regularly updated (usually as a step in a cyclic process). The
    data is attached to a name, which other programs can use to access
    that data. Each such name can be written to by only one program,
    but many other programs can read from it, so it can be used for
    broadcasting data. The data transmission is highly efficient and
    suitable for real-time applications. The transmission can be either
    reliable (which means any failures are corrected) or unreliable
    (which might have a speed advantage). For topics, the data flow is
    unidirectional: data always goes from the sender to the receiver.

  * **Services**: Services implement a more generalized form of a
    function call. With a normal function call, some code is called
    within a program, gets passed some arguments, and returns a return
    value. Usually, the code of the function runs on the same node or
    computer as the calling code. Services, however, implement a form
    of remote procedure call, which means that the call can run code on
    another node in the system by transmitting the function arguments,
    running the computation, and transmitting the result back. This can
    also be very useful if the code runs on the same node but in
    another process, because the calling process and the callee are
    separated and the programmer does not need to bother with threading
    issues.

(This is not an exhaustive list. LN provides additional functionality,
which is explained later in the chapter :doc:`user_guide`.)

.. _overview/main_components:

Main components of Links and Nodes Applications
+++++++++++++++++++++++++++++++++++++++++++++++

.. index::
   pair: Links and Nodes; main components
   single: LN manager; introduction
   single: LN clients; introduction
   single: message definitions; introduction
   single: LN daemon; introduction

Distributed systems which use Links and Nodes consist of three
components and a fourth one which usually works quietly in the
background:

* **The LN Manager**: The LN Manager is the central program which
  manages processes. It knows how to start them, their dependencies,
  whether they are running, and so on. It is configured by a central
  configuration file in text form. Each distributed system runs one LN
  Manager, which is started for a specific project or system (but
  several LN Managers for different systems can run on the same hosts).
  Normally, the LN Manager presents itself as a GUI program which
  displays processes and allows controlling them. It also has a
  powerful command line interface.

* **LN Clients**: LN Clients are processes which use the LN Client
  library to communicate and exchange messages. Typically, they are
  managed by the LN Manager (however, the LN Manager can manage any
  kind of process, not just LN clients). In order to communicate, they
  use the client library API for the language in which the client is
  written.

* **Message Definitions**: The LN clients which communicate need to
  have a common understanding of the data which they exchange, even
  across different hosts, architectures, and programming languages. To
  achieve that, messages are defined as plain text files, and they are
  identified when a specific client is started and wants to communicate
  with other clients. Within each client programming language, the
  message definitions are represented as simple packets of structured
  data, for example as a ``struct`` in C and C++, or as members of a
  class instance in Python. The translation from elements of a message
  definition to such data elements happens automatically and in an
  efficient way.

* **LN daemon**: The LN daemon is a program which runs in the
  background.
  Its purpose is to serve as a hub for exchanging messages and setting
  up connections and communication. Normally, it is automatically
  started on each host on which LN clients are used, and the LN Manager
  takes care of that. Unless there are errors or difficulties with a
  setup, you do not need to worry about it.

Quick Overview of the documentation, and where to find what
===========================================================

.. index::
   pair: Documentation; quick overview
   pair: Documentation; quickstart sections
   pair: Documentation; tutorials
   pair: Documentation; where to find specific things

This section gives an overview of the documentation.

For readers who just want to jump into it, there is a
:doc:`quickstart` section which gives an essential example with as
little code as possible. Depending on your background, it should be
sufficient to understand the main concepts, run the examples, and
experiment with them. These examples use Python and C++.

As a somewhat deeper introduction, we provide a :doc:`tutorial` for
Python and C++. Where it is convenient, it refers to the quickstart
example. However, it takes care not to require prior knowledge, and
explains step by step how the code fits together, how the main
components work, and which points are important to observe. It also
explains the central concepts behind the presented functions, and
points to where further documentation on these basic functions can be
found.

.. index::
   pair: Documentation; user guide
   pair: debugging LN clients; introduction

The main part is the (not yet finished) :doc:`user_guide`, which
consists of two sub-parts. The first explains the important concepts in
more detail. This includes the basic architecture of the system, the
components, and how they relate to each other. The second explains how
to use the APIs of the LN client library and its various language
bindings, which support Python, C++, and C.
It also has an extended section on debugging LN clients, which might be
useful for programmers not familiar with the techniques.

The User Guide is followed by the :doc:`reference` part, a programming
reference which describes each API function and component in detail.
Because the other language bindings are based on C, the C language
bindings provide the most detail; these are linked from each Python and
C++ API element. The reference also explains the important aspects of
the GUI of the LN Manager program.

The reference part is followed by a :doc:`glossary`, which explains
many technical terms you might stumble upon.

The final part is an :doc:`appendix`, which contains further detailed
reference information, such as the syntax of the configuration file.

.. index::
   single: documentation bugs; how to report

If you find Bugs in the Documentation
=====================================

Documentation is an important aspect of software. Errors and missing
parts in the documentation reduce the usability of any complex
software, and we want to make this documentation as good as possible.
So, when you find any deficiency, please open an issue and report it as
a bug. Just tag it with "documentation" and describe what is missing!