###################################
Logging Messages in Links and Nodes
###################################

.. index::
   pair: user guide; logging

What this is about
==================

This chapter describes methods to log and evaluate
message data in LN. They are geared toward a complete
ad-hoc recording of messages which can be used to
analyze or debug a system.

Two methods: LN recorder and Daemon Logger
==========================================

The two methods described here are the LN recorder
and the ln daemon logger. Both allow to
save message data in files, but do have
technical differences.

Main differences
----------------
.. index::
   triple: user guide; logging; differences between ln recorder and ln daemon logger

.. rubric:: LN Recorder Features

The **LN recorder** is just an LN client which is started with a
regexp pattern describing a set of topic or service names. When any of these topics
sends messages, their content is stored as a time series. The same is true for
service requests and responses. To do this,
the LN recorder receives that data as an LN client, so it typically
needs to be transmitted via a network.

Storage is done continuously in a efficient binary format along with
metadata about the messages. In order to avoid stalls of the system or
data loss, the location of storage needs to be on a local file system.

The data format is a special type of raw binary format which is called
LNRDB, which can be easily read into memory-mapped numpy arrays, and
equally easily converted into the :term:`HDF5` format which is widely used for
time series and geophysical data. These data formats both support
memory-mapping, which means that a computer can easily evaluate data
sets which are larger than its RAM.

Because of the continuous storing operation, the LN recorder can store
an almost unlimited duration and amount of data (as much as the local
file system can store). However because data is transported as LN
messages to the client node, there exist certain limits on the maximum
bandwidth with which the data can be transmitted. These limits depend
on both network hardware and connectivity.

.. rubric:: LN Daemon Logger Features

The **LN daemon logger** runs in an LN daemon on a specific host. It
only records data from local topics, so that data can be stored at a very high
bandwidth. When doing the recording, the data is stored in a ring
buffer in-memory, which means that there is a maximum recording size
(and therefore, time) for the data to stay available, and if that
amount is surpassed, the oldest data is erased. The limit depends on
the hardware and RAM of the node. Recording is started and stopped by
LN client API calls, and a
further call instructs the manager to save the data to a file on
the manager node. The possible storage data formats are Matlab ``*.mat``
files, which do have the practical limit that they should not be
larger than the physical RAM of the machine processing the data,
"ln-daemon-log-raw-data" files, "flat python pickle" or lnrecorder-pcap files.

The typical use case for recordings that use the LN daemon logger are
captures of short segments of high-frequency data which is used for
debugging in real-time contexts. Because no network transmission is
necessary at the time of recording, the possible data rates are very
high, which allows to store events and parameters, for example from
hardware controllers, at a high frequency.


The LN Recorder
===============

.. index::
   triple: user guide; logging; ln recorder description
   
As already mentioned, the LN recorder is a client application which
stores data from a set of topics and services as time series into a binary file.

This means that data can be transported across the network like
any other message data.

.. _user_guide/logging/version_info:

Version Information
-------------------

Both :program:`lnrecorder` and the ``lnrdb`` Python library require
at least LN version 2.1.0. Prior versions have bugs which, among
other things, cause that metadata is not written correctly when
the lnrecorder is terminated by a signal. It works together
with older versions of LN.

.. seealso::
   Note on :ref:`faq/synchronization-of-logged-data` in the :doc:`faq` part.


Command Line Arguments
----------------------

The lnrecorder is called with a required "command" argument and
options which indicate topic and service names. The command required to
record messages is "record". This is followed by one or
more patterns for topics in the format ``"topic:<name-pattern>[@frequency]"``
and for services in the format ``"service:<name-pattern>"``.

There, the name pattern is a regexp which matches the
patterns which should be recorded. A star ``"*"`` matches
all topics & services. The topics pattern can be followed by
a frequency option, which is formed of an ``"@"`` followed
by the frequency in Hertz. Using this option requires that
the publishers of the message set the message time stamps,
with a unit of seconds.

A further option is ``"-v"`` for "verbose". In this mode,
the LN recorder prints out any topics which have
been added or removed during its run.

If new topics appear which match a pattern,
they are included in the recording. Each topic
recording is numbered with a global counter,
this means when a topic is started and stopped
multiple times, it is recorded with increasing
counter values.

here the complete output of ``lnrecorder --help``::
  
  usage: lnrecorder [OPTIONS...] VERB [ARGS...]
  
  VERBS can be one of:
  
    record OUTPUT_FILE [name-pattern...]
            start recording of matching topics and services. further arguments
            are intrepreted as glob patterns. in this case only topics or services
            with matching names are recorded. each pattern can be prefixed with 'topic:'
            or 'service:' to only match topics or services.
            prepend a '!' to exclude matching topics/patterns.
            each topic pattern can be followed by a '@'-sign and a subscription rate
            like this: 'large_topics*@3' for 3Hz subscription.
  
            example: 'record topics:* !topics:do_not_log.*' (would record all topics that
            do not start with 'do_not_log.')
  
    replay INPUT_FILE [name-pattern...]
            start replaying topics. further glob-patterns select
            which topics to replay, if not all.
  
    list INPUT_FILE
            list contents of given file
  
    idle
            do nothing, just wait for LN-service requests
  
    convert INPUT_FILE OUTPUT
            convert from one logging format to another
  
    write_msg_defs INPUT_FILE FOLDER
            (unfinished, output topic msg defs to stdout)
  
  OPTIONS can be:
  
    -v      produce more verbose output (repeat up to 3 times)
  
    -s RELATIVE_START_TIMESTAMP
            (only for replay) start replaying at given timestamp
            this timestamp is relative to the logfile start-timestamp
  
    -vs
            show verbose service request/response events when they happen
  
    -name NAME
            use this NAME as prefix for all LN-services
  
    -z --gzip
            filter recorded log through gzip before writing to pcap
    -j --bzip2
            filter recorded log through bzip2 before writing to pcap
    -J --xz
            filter recorded log through xz before writing to pcap

.. only:: rmc
	  
	  (you can get the up-to-date output by calling ``cissy run -p links_and_nodes_runtime/[~2]@common/stable lnrecorder --help``)


Example
-------
.. index::
   triple: user guide; logging; lnrecorder example

Here is an example :file:`lnrdb.lnc` which can be found in the folder :file:`examples/guide/example_logging_lnrdb/`:

.. literalinclude:: examples/guide/example_logging_lnrdb/lnrdb.lnc
   :language: lnc
   :linenos:

The message definitions used are as follows:

:file:`lnrdb_example/two_doubles`


.. literalinclude:: examples/guide/example_logging_lnrdb/ln_msg_defs/lnrdb_example/two_doubles
   :linenos:


:file:`lnrdb_example/two_vectors`
      
.. literalinclude:: examples/guide/example_logging_lnrdb/ln_msg_defs/lnrdb_example/two_vectors
   :linenos:

      
This example starts several processes. process "publisher" publishes
some data, and the process "lnrecorder" records it.

An important point is that the lnrecorder needs to be started
in a shell snippet with "exec", or alternatively with the
"use_execvpe" flag in the process section. This ensures
that it will receive correct signals on termination,
and this in turn makes sure that files with meta data
are properly closed.

.. note:: todo: is this still necessary with LN >= 2.1.0?


Synchronization of logged Data and Time Stamps
----------------------------------------------

.. index::
   triple: user guide; logging; time stamps and data synchronization
   pair: logging; network time protocol
   pair: logging; differences in length of recorded data
   single: time synchronization
   single: time stamps; of logged messages
   single: synchronization; of logged messages
   single: NTP (Network Time Protocol); when to use

The lnrecorder does not ensure that data from different topic
subscribers or processes are logged in a synchronized manner. It
solely records any packets that were captured during its time of
operation. When different subscribers, or different processes are
started, they will not start operation at the same time (the
difference can easily be 0.5 seconds for the subscribe operation
alone). Therefore, both the original times of recorded messages, and
the number of messages are expected to differ.

However, the lnrecorder records the time stamps which are passed when
any packets are sent. These refer to the system clock, and allow for a
precise synchronization of different messages, and aligning related
data. If recording from different hosts need to be related in terms of
time, it must be ensured that their system clocks are synchronized via
network time protocol (:term:`NTP`) or other means.

Each lnrdb table for a recorded topic includes these two timestamp columns:

- ``timestamp`` local system time when lnrecorder got hold of this topic message. It corresponds to the
  ``_packet_log_ts`` in ln_daemon recordings.

- ``publisher_timestamp`` system time on publisher host when message was published. It corresponds to the
  ``_packet_source_ts`` in ln_daemon recordings.

.. note:: the ``timestamp`` is usually of less importance: it will show more jitter, drift and probably lags behind by
	  the time needed for transfer between hosts...

.. seealso::
   Note on :ref:`faq/synchronization-of-logged-data` in the :doc:`faq` part.

   Description of time stamp for the ``ln_read()`` function in
   :ref:`reference/c/ln_read`.

LN Service Interface aka idle-mode
----------------------------------

when lnrecorder is started like this: ``lnrecorder idle``, it will only react to these
LN-service calls:

- ``<name>.record`` using md ``lnrecorder/record``
- ``<name>.record_waiting`` using md ``lnrecorder/record_waiting``
- ``<name>.stop`` using md ``lnrecorder/stop``
- ``<name>.replay`` using md ``lnrecorder/replay``

``<name>`` defaults to ``lnrecorder`` but if started via ln-manager it will be the name
fo the process definition which started the lnrecorder.

when calling the ``.record``-service you need to provide these fields:

- ``filename`` name of to be created lnrdb or ``.pcap``-file
- ``patterns`` is a list of ``ln/string``. here you can specify exactly the same
  topic- / service-name-patterns as you would do on the command line (including the
  ``@``-rate-limiter for topics). if this list is kept empty, all topics & services will be logged!

a service response with empty ``error_message`` field indicates that the lnrecorder is now recording.
The ``.record`` service waits up to 10 seconds until topics that existed when recording started have
received a first packet, but it still reports success if that maximum wait time expires.

The ``.record_waiting`` service uses the same fields as ``.record`` and adds:

- ``wait_timeout`` maximum time in seconds to wait until initially existing topics have received
  packets.

The ``.record_waiting`` response includes ``pending_topics``, a list of initially existing topics that
had not received packets when ``wait_timeout`` expired. The recording has already started; callers can
decide whether a non-empty ``pending_topics`` list is an error for their use case.

to stop a recording, you should call the ``.stop`` LN-service. it does not need any request fields.

after the lnrecorder is again in its stopped-state, you can either start another recording or start
a replay:

to replay a log you call the ``.replay`` LN-service and provide these request fields:

- ``filename`` is the name of the lnrdb or ``.pcap``-file
- ``patterns`` is a list of ``ln/string`` and is optional. use it to only replay a selection of
  the recorded topics / services. if not specified, all topics / services are replayed.
- ``start_time`` start replaying from a given time-offset (seconds) into the recording. use ``0`` to
  start from the beginning.
- ``stop_time`` stop playback at given time in seconds after recording-start. set to ``0`` to play
  until end of recording.
- ``wait`` optional. if set to ``1`` the service call will wait until the replay is finished.
  otherwise the service call will return as soon as the replay was started

a replay can also be stopped via the .stop LN-service from above.


Data Accesss
------------

.. index::
   triple: user guide; logging; accessing data from lnrecorder


Reading logged Data in LNRDB Format into Python
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


The process "python example read log" reads the generated log data,
and prints some values. It uses the "lnrdb" python module
from the links_and_nodes_python package, ands its listing
looks like this:

.. literalinclude:: examples/guide/example_logging_lnrdb/read_log.py
   :language: python
   :linenos:


An important aspect of the lnrdb library is its capability
to return memory-mapped numpy arrays. This means that
it is possible to access time series which are far larger
than a size that would fit into memory, but in the same
way as a "normal" memory-backed numpy array. Data is
read from disk only on-demand and can also be modified.
      

.. index::
   triple: user guide; logging; convert data into hdf5 format
   pair: hdf5 format; logging


Converting Logged Data to HDF5
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The process "convert log" reads the LNRDB binary
data and saves it in :term:`HDF5` format. HDF5 is a scientific data
format that was developed for time series of large amounts of
geophysical data, for example. One advantage it has is that
it can hold any relevant metadata, and also that 
it can be easily converted into other formats.

The converter script shown here takes at least two arguments:
The first is the hdf5 file which is to be generated as output,
and the following ones are the lnrdb recordings which should
be included in the hdf5 file. That file acts like a kind of
archive and container and can hold multiple time series
as well as their meta data.


.. literalinclude:: examples/guide/example_logging_lnrdb/convert_logs.py
   :language: python
   :linenos:


Once the conversion is done, no dependencies on LN or the
real-time run-time system are needed, which means that
the data can be analysed off-line if desired.

there is also a ready to use conan-packaged stand-alone converter: ``lnrdb_to_hdf5_converter/1.0.0@common/unstable``
for details check here https://rmc-github.robotic.dlr.de/common/lnrdb_to_hdf5_converter

Reading HDF5 Data
~~~~~~~~~~~~~~~~~

As with the lnrdb module, such HDF5 files can be used to access
memory-mapped binary data. The script "read_hdf5.py" shows how this
can be done:


.. literalinclude:: examples/guide/example_logging_lnrdb/read_hdf5.py
   :language: python
   :linenos:


.. index::
   triple: user guide; logging; reading lnrecorder data into Matlab

      
Reading HDF5 into Matlab
~~~~~~~~~~~~~~~~~~~~~~~~

The script in :file:`matlab/example_hdf5.m` shows how a :term:`HDF5`
file can be accessed in matlab.

.. literalinclude:: examples/guide/example_logging_lnrdb/matlab/example_hdf5.m
   :language: matlab
   :linenos:

Here, the hdf5 reader routine returns matlab matrix objects. In
difference to Numpy, these objects are not memory-mapped. This
means that if the time series is very large, only a selection of
it can be read into memory at the same time (Matlab is capable
to access memory-mapped data, but only with low-level routines).


Direct Access to LNRDB Data in Matlab
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As a last example, we show some scripts which read binary LNRDB data
directly in matlab:

.. literalinclude:: examples/guide/example_logging_lnrdb/matlab/example.m
   :language: matlab
   :linenos:

.. literalinclude:: examples/guide/example_logging_lnrdb/matlab/read_topic1.m
   :language: matlab
   :linenos:
      
.. literalinclude:: examples/guide/example_logging_lnrdb/matlab/read_topic2.m
   :language: matlab
   :linenos:

The offsets need to be computed from the message definitions.
      

As we see, this approach is more fragile, since it needs to use
manually computed direct offsets into the binary files, which depend
on the message definitions. Any change to the message definitions will
render these offsets invalid, requiring consistent updates to avoid
reading invalid binary data.

.. only:: rmc

	  .. tip::

	     A repository with this full example in runnable form is available under

	     https://rmc-github.robotic.dlr.de/schm-fl/lnrdb_example/tree/convert-hdf5
	     
	     (branch convert-hdf5)

	     
.. only:: rmc

	  Using the LN recorder with Cissy
	  ================================

	  If you want to use the LN recorder with cissy, this cissy workspace
	  file and LN configuration can be used as a template. Both together
	  ensure that the right minimum version of lnrecorder and lnrdb python
	  module are used:

	  :file:`cissy_workspace.yaml`

	  .. literalinclude:: examples/guide/example_logging_lnrdb/cissy/cissy_workspace.yaml
	     :language: yaml
	     :linenos:


	  :file:`lnrdb_cissy.lnc`
		
	  .. literalinclude:: examples/guide/example_logging_lnrdb/cissy/lnrdb_cissy.lnc
	     :language: lnc
	     :linenos:
		
Troubleshooting
~~~~~~~~~~~~~~~

.. index::
   triple: user guide; logging; troubleshooting missing metadata


If any metadata files appear to be missing, check that the right
LN version >= 2.1.0 is used, and that the lnrecorder program
is started either with the "exec"  shell command, or with the
"execvpe" flag in the LNM configuration for the process!

The LN Daemon Logger
====================

.. index::
   triple: user guide; logging; example LN daemon logger


The LN daemon logger can be accessed by the LN client API,
which allows to instruct an LN daemon to log message data
to an `circular buffer <https://en.wikipedia.org/wiki/Circular_buffer>`_ in
RAM. It is run from an LN client, and instructs the daemon on a given node to
log named topics.

Here is an example script how to use it:

.. literalinclude:: examples/guide/example_logging_daemon_logger/lndaemon_logger.py
   :language: python
   :linenos:


Here, ``logger.add_topic()`` configures logging for two topics: ``test.topic1``
and ``test.topic2``, for each topic the last 100000 samples will be kept in
RAM. ``logger.enable()`` starts logging, and ``logger.disable()`` stops
it. ``logger.manager_save(filename)`` instructs the LN manager to fetch the
collected data from the daemons and save it to the given file.

The default is to log the "source shared memory" on the host the publisher runs
on. Each shared-memory can only have one logger running at a time. If you have a subscriber on a different host or a subscriber with
``subscriber-rate != -1`` they will get their own shared-memory. If you want to
log data that arrives there, you need to specify a topic-name in the form of
``.add_topic("subscriber:<CLIENT-NAME>[@<SUB-RATE>]:<TOPIC-NAME>)`` where
``<CLIENT-NAME>`` is the unique name of the ln-client doing that subscription
(check the ln_manager's "clients"-tab). specifying a ``<SUB-RATE>`` is only
useful if that client does multiple subscriptions of the same topic (not many
use-cases).

In the example, the file name terminaes with "pcap", so that PCAP format is
used. The logger can also save in Matlab's ``.mat`` format or python's
``.pickle``.

At the time being, recording of complex data structures is probably
done best with the LN recorder with its "lnrdb" format that
can be converted into :term:`HDF5` if needed.


.. index::
   triple: user guide; logging; comparison lnrecorder vs ln daemon logger


LN Recorder and Daemon Logger: When to use Which One
====================================================

=============================  ================================  =======================================
aspect                         LN recorder                       Daemon logger
=============================  ================================  =======================================
max duration                   huge; limited by file system      limited by ram of publisher-node
max data rate                  limited by network                very fast; limited by shared memory
suitable for complex data      yes                               less
stores meta data               yes                               no
supported file formats         LNRDB, :term:`HDF5`, PCAP         PICKLE, PCAP, MAT, RAW
memory-mapped view avilable    yes                               no (could be done with RAW format)
typical usage                  recording motions, I/O, workflow  debugging hardware drivers, controllers
=============================  ================================  =======================================

.. only:: rmc

	  .. tip::

	     Another option for logging is the "monitor" package,
	     which provides logging of status messages that is similar
	     to syslog message recording and transmission.