==============
 Introduction
==============

.. contents::

This chapter introduces Links and Nodes: what it is in general, and
which specific problems it addresses.

.. index::   
   pair: Links and Nodes; overview on central components

It also outlines the central components of Links and Nodes, explains
important terms, and finally gives an overview of the information
found in the following chapters and of where to continue reading,
depending on your level of prior knowledge and need for detail.
  
What is "Links and Nodes"?
==========================

.. index::
   single: Links and Nodes; what it is


Links and Nodes (which we usually abbreviate as LN) is a
:term:`middleware` for creating and managing flexible distributed
:term:`real-time` systems. It was created to develop and control
:term:`embedded <embedded system>` robotic systems.


So, what challenges does Links and Nodes address? Why should one
go to the trouble of using it?

.. _introduction/rationale:

What makes it difficult to set up distributed robotic systems?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: Links and Nodes; what problems does it solve?
   single: processes; introduction
   single: memory protection; introduction
   single: threads programming
   single: concurrency bugs
   single: isolation from user environment
   single: user environment; why it can cause trouble
   pair: introduction; dependency management
   pair: introduction; process management


There are a number of challenges in setting up experimental
robotic systems which Links and Nodes addresses. The most important
ones are:

* Such systems consist of many different parts, often written by
  different people in different programming languages. Typically, the
  parts are divided into :term:`processes <process>`. Processes are
  isolated from each other: each one has its own memory address space
  and cannot access the memory of other processes. This so-called
  :term:`memory protection` is intentional, because it makes processes
  easier to develop and debug. Nevertheless, the processes need to
  communicate somehow.

* It is possible to write low-level communication routines using OS
  facilities such as System V :term:`shared memory` or
  :term:`pthreads`. But programming these is difficult and
  error-prone. For example, it is easy to create
  :term:`race conditions <race condition>`, which are often very
  difficult to debug because they lead to :term:`non-deterministic`
  errors and are, depending on the programming language, a form of
  :term:`undefined behavior`. Time spent on such debugging adds little
  to what the robotic system should achieve or show.

* If one uses low-level OS facilities for communication, setting up
  such a system requires a lot of repetitive code which is hard to
  keep track of. These facilities are also not well supported in many
  programming languages, for example those which specialize in
  numerics or ease of use.
  
* Configurations quickly become quite complex, which makes it
  difficult to understand how a system is configured in detail.

* When many people work together, everyone uses their individual
  :term:`user environment` configuration. Usually, there are multiple
  ways of altering a configuration and multiple places where changes can
  be stored. While convenient for single users, this can make it very
  difficult to provide a clearly defined working environment,
  especially because the number of participants grows quickly with the
  number of shared and re-used components. It also makes it difficult
  to ensure that the system behaves in a :term:`deterministic and
  reproducible way <reproducibility>`: what works today might not work
  tomorrow, or for a different user.
  
* Hardware, and how it is connected via :term:`networks <network>`,
  changes with the task and over time. This means that manual
  configurations need to be adapted to which nodes run in a network,
  how they are connected, and so on.

* Processes which have :term:`real-time` requirements need to have
  :term:`priorities <process priorities>` assigned and managed.

* It is also often necessary that one process is up and running
  before other processes which depend on it can start working.
  We say that the second process :term:`depends <dependency>` on the
  first one.
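
To make the race-condition hazard from the list above concrete, here
is a minimal sketch in plain Python. It uses only the standard library
and is not LN code. Several threads increment a shared counter, and
the ``threading.Lock`` is exactly the kind of manual synchronization
that is easy to forget when writing such code by hand. The names
``count_with_lock``, ``N_THREADS`` and ``INCREMENTS`` are chosen for
this illustration only.

.. code-block:: python

   import threading

   N_THREADS = 4
   INCREMENTS = 10_000


   def count_with_lock():
       """Increment a shared counter from several threads, guarded by a lock."""
       counter = 0
       lock = threading.Lock()

       def worker():
           nonlocal counter
           for _ in range(INCREMENTS):
               # Without the lock, the read-modify-write below can interleave
               # between threads, and updates may be lost.
               with lock:
                   counter += 1

       threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
       for t in threads:
           t.start()
       for t in threads:
           t.join()
       return counter


   total = count_with_lock()

With the lock in place, ``total`` always equals ``N_THREADS *
INCREMENTS``. Removing it may produce a smaller value on some runs and
not on others, which is what makes such bugs hard to track down.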
  
Features and design principles of LN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: Links and Nodes; features and main principles
   pair: LN manager; configuration file
   pair: introduction; network transparency
   single: reproducibility


The following are some features of Links and Nodes which address these
challenges and make it suited to robotic applications:


Main features and principles
++++++++++++++++++++++++++++

Links and Nodes

* displays and controls process state via a :term:`GUI` or a command line
  interface.
* uses a single :term:`configuration source <configuration file>`
  to organize a system, which is in text form and can be managed by
  version control.
* supports :term:`real-time` operation. For example, it can transmit
  :term:`messages <message>` between processes on the same computer
  nearly instantly (within microseconds).
* requires no knowledge of :term:`thread programming <threads>`.
* supports different programming languages and
  environments, currently C, C++, Python 2, Python 3, and
  Matlab/Simulink.
* isolates processes from the particular :term:`user
  environment <environment>`, so that different personal configurations
  are less likely to break a setup.
* aims to provide a fully :term:`reproducible <reproducibility>` setup.
* makes :term:`process management` and communication
  :term:`network-transparent <network transparency>`, which means that
  it makes no logical difference where some of the processes run.

..  Commenting out these because it gets too long
..    
..  Further features which help
..  ---------------------------
..  * it makes it easy to set-up and control demonstrations of systems
..  * it only needs TCP connections in order to work on different
..    hosts.
..  * it presents and manages different systems and architectures in
..    a homogeneous ways, abstracting away differences in handling.
..  * It uses name spaces and groups to group processes and messages,
..    which makes it easier to compose complex systems out of simpler
..    components, and to manage these components.
..  * it avoids redundancy and boiler-plate code
..  * it allows to define templates for process configuration, to help
..    avoid repetitive configurations 
..  * It abstracts nodes and network properties in the configuration,
..    which makes it easier to run a system on different hardware or at
..    another site
..  

How all of these principles are implemented and how these features are
used goes beyond the scope of this introduction. They will be
discussed later in the :doc:`user_guide` part.


.. _overview/functions:

Overview of the functions provided by Links and Nodes
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index::
   pair: Links and Nodes; main functions
   single: process management in Links and Nodes
   single: process dependencies; introduction
   single: priority management; introduction
   single: nodes of a distributed system; introduction


LN provides two main functions:

* **Process management**: This allows you to control processes and their
  interdependencies, for example starting and stopping them in a
  defined order, handling errors, looking at their output, assigning
  scheduling priorities, sending :term:`signals <UNIX signal>`, and so
  on. The result is a kind of init system for a robotic system, similar
  to what `systemd <https://en.wikipedia.org/wiki/Systemd>`_ does on a
  Linux system, but much more flexible, because it can be
  controlled interactively.

  There are several central aspects which Links and Nodes takes
  care of:

  * **Process dependencies**: For example, LN can start processes
    which depend on each other. If you then modify the source code of
    one of these processes, re-compile, and restart it, the processes
    which depend on it are restarted as needed. In this way, it
    supports flexible development of robotic processes.
			   
  * **Priorities**: In embedded real-time systems, it matters which
    process runs first when several tasks compete for the same CPU.
    For example, one process might contain a control loop which
    monitors basic safety parameters, and another one might check the
    battery state of a robot and initiate actions if the battery runs
    low. Priorities define which processes are more important, in the
    sense that the CPU attends to them first.

  * **Nodes**: A distributed system often has several nodes which cooperate
    to do some work. They could be different computers with
    different CPUs and operating systems. The process management
    knows which node should run which process, and how to
    access and start them, so that all parts work together like
    a single computer.
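
The dependency handling described above can be pictured as a graph
problem. The following sketch uses Python's standard ``graphlib``
module (Python 3.9 or later) to compute a valid start order and the
set of processes which would need a restart after one process
changes. It does not show how LN is implemented or configured; the
process names and the helpers ``start_order`` and ``dependents_of``
are hypothetical.

.. code-block:: python

   from graphlib import TopologicalSorter

   # Hypothetical processes: each key depends on the processes in its set.
   DEPENDS_ON = {
       "driver": set(),
       "controller": {"driver"},
       "planner": {"controller"},
       "logger": {"driver"},
   }


   def start_order(depends_on):
       """Return an order in which every process starts after its dependencies."""
       return list(TopologicalSorter(depends_on).static_order())


   def dependents_of(depends_on, changed):
       """Return all processes that directly or indirectly depend on `changed`."""
       affected = set()
       frontier = [changed]
       while frontier:
           current = frontier.pop()
           for name, deps in depends_on.items():
               if current in deps and name not in affected:
                   affected.add(name)
                   frontier.append(name)
       return affected


   order = start_order(DEPENDS_ON)
   to_restart = dependents_of(DEPENDS_ON, "driver")

In this toy graph, ``driver`` always comes first in ``order``, and a
change to ``driver`` marks all three other processes for a restart.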

.. _intro/communication:	  

.. index::
   single: Communication in Links and Nodes
   single: topics; introduction
   single: services; introduction


* **Communication**: LN provides an easy way for these processes to
  work together by sending each other defined pieces of data.  For
  this, it primarily uses a `message-passing approach
  <https://en.wikipedia.org/wiki/Message_passing>`_: One process
  prepares such a piece of data, and when it is ready, it calls an LN
  library routine which transmits that data to the recipient process
  (or to a number of recipient processes). The recipient processes can
  use another library routine to read this data once it has arrived.
  This is a widely used, quite easy and safe paradigm: Programmers who
  use the library routines do not need to take care of
  :term:`multi-threading <threads>`, locking,
  :term:`deadlocks <deadlock>`, :term:`race conditions <race
  condition>`, and many other things which often make multi-threaded
  programs very difficult to write and debug.

  Links and Nodes supports several communication patterns. The two
  most important ones are:

  * **Topics**: Topics implement a `publisher/subscriber pattern
    <https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern>`_.
    This means that a program can make a piece of data available which
    is regularly updated (usually as a step in a cyclic process).  The
    data is attached to a name, which other programs can use to
    access that data. Each name can be written to by only one
    program, but many other programs can read from it, so it can be
    used for broadcasting data.  The data transmission is highly
    efficient and suitable for real-time applications.  The
    transmission can be either reliable (which means any failures are
    corrected), or unreliable (which might have a speed advantage).

    For topics, the data flow is unidirectional: data
    always goes from the sender to the receiver.

  * **Services**: Services implement a more generalized form of a
    function call. With a normal function call, some code is called
    within a program, gets passed some arguments, and returns a
    value. Usually, the code of the function runs on the same
    node or computer as the calling code.  Services, however, implement
    a form of `remote procedure call
    <https://en.wikipedia.org/wiki/Remote_procedure_call>`_, which
    means that the call can run code on another node in the system, by
    transmitting the function arguments, running the computation, and
    transmitting the result back. This can also be very useful if the
    code runs on the same node, but in another process, because the
    calling process and the callee are separated and the programmer
    does not need to bother with threading issues.


(This is not an exhaustive list. LN provides additional functions,
which are explained later in the chapter :doc:`user_guide`.)
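
The two communication patterns described above can be mimicked in a
few lines of plain Python. This is only a toy model which runs inside
a single process; it is not the LN client API, and the classes
``TopicBoard`` and ``ServiceRegistry`` as well as all topic and
service names are invented for this illustration. It shows the
one-writer, many-readers rule for topics and the request/response
shape of a service call.

.. code-block:: python

   class TopicBoard:
       """Toy stand-in for topics: one writer per name, many readers."""

       def __init__(self):
           self._owner = {}   # topic name -> publisher that owns it
           self._latest = {}  # topic name -> most recently published data

       def publish(self, publisher, name, data):
           # The first publisher claims the name; all others are rejected.
           owner = self._owner.setdefault(name, publisher)
           if owner != publisher:
               raise PermissionError(f"{name!r} is already published by {owner!r}")
           self._latest[name] = data

       def read(self, name):
           # Any number of subscribers may read the latest value.
           return self._latest[name]


   class ServiceRegistry:
       """Toy stand-in for services: a named request/response call."""

       def __init__(self):
           self._handlers = {}

       def provide(self, name, handler):
           self._handlers[name] = handler

       def call(self, name, **request):
           # In a distributed system, the request would travel to another
           # process or node and the response would travel back.
           return self._handlers[name](**request)


   board = TopicBoard()
   board.publish("sensor_node", "robot.joint_angle", {"angle_rad": 0.25})
   reading = board.read("robot.joint_angle")

   services = ServiceRegistry()
   services.provide("add", lambda a, b: {"sum": a + b})
   response = services.call("add", a=2, b=3)

Data only flows from publisher to readers in ``TopicBoard``, whereas
``ServiceRegistry.call`` sends arguments one way and a result back.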

.. _overview/main_components:

Main components of Links and Nodes Applications
+++++++++++++++++++++++++++++++++++++++++++++++

.. index::
   pair: Links and Nodes; main components
   single: LN manager; introduction
   single: LN clients; introduction
   single: message definitions; introduction
   single: LN daemon; introduction


Distributed systems which use Links and Nodes consist of three
components, plus a fourth one which usually works quietly in the
background:

* **The LN Manager**: The LN Manager is the central program which manages
  processes: it knows how to start them, what their dependencies are,
  whether they are running, and so on.

  It is configured by a central configuration file in text form. Each
  distributed system runs one LN Manager which is started for a
  specific project or system (but several LN Managers for different
  systems can run on the same hosts).

  Normally, the LN Manager appears as a GUI program which displays
  processes and allows you to control them. It also has a powerful
  command line interface.

* **LN Clients**: LN Clients are processes which use the LN Client
  library to communicate and exchange messages.

  Typically, they are managed by the LN Manager (however, the LN
  Manager can manage any kind of process, not just LN clients). In
  order to communicate, they use the client library API for the
  language in which the client is written.

* **Message Definitions**:

  The LN clients which communicate need to have a common
  understanding about the data which they exchange, even across
  different hosts, architectures, and programming languages.

  To achieve that, messages are defined as plain text files, and they
  are identified when a specific client is started and wants to
  communicate with other clients. Within each client programming
  language, the message definitions are represented as simple packets
  of structured data, for example as a ``struct`` in C and C++, or as
  members of a class instance in Python. The translation from elements
  of a message definition to such data elements happens automatically
  and in an efficient way.

* **LN daemon**: The LN daemon is a program which runs in the
  background. Its purpose is to serve as a hub for exchanging
  messages and setting up connections and communication.

  Normally, it is automatically started on each host on
  which LN clients are used, and the LN Manager takes care
  of that. Unless there are errors or difficulties with a
  setup, you do not need to worry about this.
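
The idea that one message definition corresponds to a ``struct`` in C
and to class members in Python can be sketched with the standard
library alone. This is an illustration only: the ``JointState``
message, its fields, and the byte layout are invented here, and they
do not reflect LN's actual message definition syntax, wire format, or
client API.

.. code-block:: python

   import struct
   from dataclasses import dataclass, astuple

   # Hypothetical layout: a little-endian uint32 followed by two float64
   # values, comparable to an unpadded C struct
   # { uint32_t seq; double position; double velocity; }.
   JOINT_STATE_LAYOUT = struct.Struct("<Idd")


   @dataclass
   class JointState:
       seq: int
       position: float
       velocity: float

       def pack(self) -> bytes:
           """Serialize the fields into a fixed-size byte packet."""
           return JOINT_STATE_LAYOUT.pack(*astuple(self))

       @classmethod
       def unpack(cls, payload: bytes) -> "JointState":
           """Rebuild the message from a byte packet."""
           return cls(*JOINT_STATE_LAYOUT.unpack(payload))


   sent = JointState(seq=1, position=0.5, velocity=-0.25)
   packet = sent.pack()
   received = JointState.unpack(packet)

Because both sides agree on the same fixed layout, the receiver can
reconstruct exactly the fields the sender wrote, whatever language
each side is written in.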


Quick Overview on the documentation, and where to find what
===========================================================

.. index::
   pair: Documentation; quick overview
   pair: Documentation; quickstart sections
   pair: Documentation; tutorials
   pair: Documentation; where to find specific things


This section gives an overview of the documentation.

For readers who just want to jump right in, there is a
:doc:`quickstart` section which gives an essential example with
as little code as possible. Depending on your background, it
should be sufficient to understand the main concepts, run the examples,
and experiment with them. These examples use Python and C++.

As a somewhat deeper introduction, we provide a :doc:`tutorial` for
Python and C++. Where it is convenient, it refers to the quickstart
example. However, it takes care not to require prior knowledge, and
explains step by step how the code fits together, how the main
components work, and which points are important to observe. It also
explains the central concepts behind the presented functions, and
points to where further documentation on these basic functions can be
found.

.. index::
   pair: Documentation; user guide
   pair: debugging LN clients; introduction
   
The main part is the (not yet finished) :doc:`user_guide`, which
consists of two sub-parts. The first explains the important concepts
in more detail.  This includes the basic architecture of the system,
the components, and how they relate to each other. The last chapter
explains how to use the APIs of the LN client library and its
various language bindings, which support Python, C++, and C. It also
has an extended section :doc:`on debugging LN clients
<user_guide_debugging_cpp>`, which might be useful for programmers not
familiar with these techniques.


The User Guide is followed by the :doc:`reference` part, a
programming reference which describes each API function and component
in detail. Because the other language bindings are based on C, the C
language bindings provide the most detail; they are linked from each
Python and C++ API element.

The reference also explains the important aspects of the GUI of the LN
Manager program. 

The reference part is followed by a :doc:`glossary`, which explains
many technical terms you might stumble upon.

The final part is an :doc:`appendix`, which contains some detailed
further reference information, such as the syntax of the configuration
file.


.. index::
   single: documentation bugs; how to report

If you find Bugs in the Documentation
=====================================

Documentation is an important aspect of software. Errors and gaps in
the documentation can reduce the usability of any complex software,
and we want to make this documentation as good as possible.  So, when
you find any deficiency, please open an issue and report it as a bug:
just tag it with "documentation" and describe what is missing!