################################### Logging Messages in Links and Nodes ################################### .. index:: pair: user guide; logging What this is about ================== This chapter describes methods to log and evaluate message data in LN. They are geared toward a complete ad-hoc recording of messages which can be used to analyze or debug a system. Two methods: LN recorder and Daemon Logger ========================================== The two methods described here are the LN recorder and the ln daemon logger. Both allow to save message data in files, but do have technical differences. Main differences ---------------- .. index:: triple: user guide; logging; differences between ln recorder and ln daemon logger .. rubric:: LN Recorder Features The **LN recorder** is just an LN client which is started with a regexp pattern describing a set of topic or service names. When any of these topics sends messages, their content is stored as a time series. The same is true for service requests and responses. To do this, the LN recorder receives that data as an LN client, so it typically needs to be transmitted via a network. Storage is done continuously in a efficient binary format along with metadata about the messages. In order to avoid stalls of the system or data loss, the location of storage needs to be on a local file system. The data format is a special type of raw binary format which is called LNRDB, which can be easily read into memory-mapped numpy arrays, and equally easily converted into the :term:`HDF5` format which is widely used for time series and geophysical data. These data formats both support memory-mapping, which means that a computer can easily evaluate data sets which are larger than its RAM. Because of the continuous storing operation, the LN recorder can store an almost unlimited duration and amount of data (as much as the local file system can store). However because data is transported as LN messages to the client node, there exist certain limits on the maximum bandwidth with which the data can be transmitted. These limits depend on both network hardware and connectivity. .. rubric:: LN Daemon Logger Features The **LN daemon logger** runs in an LN daemon on a specific host. It only records data from local topics, so that data can be stored at a very high bandwidth. When doing the recording, the data is stored in a ring buffer in-memory, which means that there is a maximum recording size (and therefore, time) for the data to stay available, and if that amount is surpassed, the oldest data is erased. The limit depends on the hardware and RAM of the node. Recording is started and stopped by LN client API calls, and a further call instructs the manager to save the data to a file on the manager node. The possible storage data formats are Matlab ``*.mat`` files, which do have the practical limit that they should not be larger than the physical RAM of the machine processing the data, "ln-daemon-log-raw-data" files, "flat python pickle" or lnrecorder-pcap files. The typical use case for recordings that use the LN daemon logger are captures of short segments of high-frequency data which is used for debugging in real-time contexts. Because no network transmission is necessary at the time of recording, the possible data rates are very high, which allows to store events and parameters, for example from hardware controllers, at a high frequency. The LN Recorder =============== .. index:: triple: user guide; logging; ln recorder description As already mentioned, the LN recorder is a client application which stores data from a set of topics and services as time series into a binary file. This means that data can be transported across the network like any other message data. .. _user_guide/logging/version_info: Version Information ------------------- Both :program:`lnrecorder` and the ``lnrdb`` Python library require at least LN version 2.1.0. Prior versions have bugs which, among other things, cause that metadata is not written correctly when the lnrecorder is terminated by a signal. It works together with older versions of LN. .. seealso:: Note on :ref:`faq/synchronization-of-logged-data` in the :doc:`faq` part. Command Line Arguments ---------------------- The lnrecorder is called with a required "command" argument and options which indicate topic and service names. The command required to record messages is "record". This is followed by one or more patterns for topics in the format ``"topic:[@frequency]"`` and for services in the format ``"service:"``. There, the name pattern is a regexp which matches the patterns which should be recorded. A star ``"*"`` matches all topics & services. The topics pattern can be followed by a frequency option, which is formed of an ``"@"`` followed by the frequency in Hertz. Using this option requires that the publishers of the message set the message time stamps, with a unit of seconds. A further option is ``"-v"`` for "verbose". In this mode, the LN recorder prints out any topics which have been added or removed during its run. If new topics appear which match a pattern, they are included in the recording. Each topic recording is numbered with a global counter, this means when a topic is started and stopped multiple times, it is recorded with increasing counter values. here the complete output of ``lnrecorder --help``:: usage: lnrecorder [OPTIONS...] VERB [ARGS...] VERBS can be one of: record OUTPUT_FILE [name-pattern...] start recording of matching topics and services. further arguments are intrepreted as glob patterns. in this case only topics or services with matching names are recorded. each pattern can be prefixed with 'topic:' or 'service:' to only match topics or services. prepend a '!' to exclude matching topics/patterns. each topic pattern can be followed by a '@'-sign and a subscription rate like this: 'large_topics*@3' for 3Hz subscription. example: 'record topics:* !topics:do_not_log.*' (would record all topics that do not start with 'do_not_log.') replay INPUT_FILE [name-pattern...] start replaying topics. further glob-patterns select which topics to replay, if not all. list INPUT_FILE list contents of given file idle do nothing, just wait for LN-service requests convert INPUT_FILE OUTPUT convert from one logging format to another write_msg_defs INPUT_FILE FOLDER (unfinished, output topic msg defs to stdout) OPTIONS can be: -v produce more verbose output (repeat up to 3 times) -s RELATIVE_START_TIMESTAMP (only for replay) start replaying at given timestamp this timestamp is relative to the logfile start-timestamp -vs show verbose service request/response events when they happen -name NAME use this NAME as prefix for all LN-services -z --gzip filter recorded log through gzip before writing to pcap -j --bzip2 filter recorded log through bzip2 before writing to pcap -J --xz filter recorded log through xz before writing to pcap .. only:: rmc (you can get the up-to-date output by calling ``cissy run -p links_and_nodes_runtime/[~2]@common/stable lnrecorder --help``) Example ------- .. index:: triple: user guide; logging; lnrecorder example Here is an example :file:`lnrdb.lnc` which can be found in the folder :file:`examples/guide/example_logging_lnrdb/`: .. literalinclude:: examples/guide/example_logging_lnrdb/lnrdb.lnc :language: lnc :linenos: The message definitions used are as follows: :file:`lnrdb_example/two_doubles` .. literalinclude:: examples/guide/example_logging_lnrdb/ln_msg_defs/lnrdb_example/two_doubles :linenos: :file:`lnrdb_example/two_vectors` .. literalinclude:: examples/guide/example_logging_lnrdb/ln_msg_defs/lnrdb_example/two_vectors :linenos: This example starts several processes. process "publisher" publishes some data, and the process "lnrecorder" records it. An important point is that the lnrecorder needs to be started in a shell snippet with "exec", or alternatively with the "use_execvpe" flag in the process section. This ensures that it will receive correct signals on termination, and this in turn makes sure that files with meta data are properly closed. .. note:: todo: is this still necessary with LN >= 2.1.0? Synchronization of logged Data and Time Stamps ---------------------------------------------- .. index:: triple: user guide; logging; time stamps and data synchronization pair: logging; network time protocol pair: logging; differences in length of recorded data single: time synchronization single: time stamps; of logged messages single: synchronization; of logged messages single: NTP (Network Time Protocol); when to use The lnrecorder does not ensure that data from different topic subscribers or processes are logged in a synchronized manner. It solely records any packets that were captured during its time of operation. When different subscribers, or different processes are started, they will not start operation at the same time (the difference can easily be 0.5 seconds for the subscribe operation alone). Therefore, both the original times of recorded messages, and the number of messages are expected to differ. However, the lnrecorder records the time stamps which are passed when any packets are sent. These refer to the system clock, and allow for a precise synchronization of different messages, and aligning related data. If recording from different hosts need to be related in terms of time, it must be ensured that their system clocks are synchronized via network time protocol (:term:`NTP`) or other means. Each lnrdb table for a recorded topic includes these two timestamp columns: - ``timestamp`` local system time when lnrecorder got hold of this topic message. It corresponds to the ``_packet_log_ts`` in ln_daemon recordings. - ``publisher_timestamp`` system time on publisher host when message was published. It corresponds to the ``_packet_source_ts`` in ln_daemon recordings. .. note:: the ``timestamp`` is usually of less importance: it will show more jitter, drift and probably lags behind by the time needed for transfer between hosts... .. seealso:: Note on :ref:`faq/synchronization-of-logged-data` in the :doc:`faq` part. Description of time stamp for the ``ln_read()`` function in :ref:`reference/c/ln_read`. LN Service Interface aka idle-mode ---------------------------------- when lnrecorder is started like this: ``lnrecorder idle``, it will only react to these LN-service calls: - ``.record`` using md ``lnrecorder/record`` - ``.record_waiting`` using md ``lnrecorder/record_waiting`` - ``.stop`` using md ``lnrecorder/stop`` - ``.replay`` using md ``lnrecorder/replay`` ```` defaults to ``lnrecorder`` but if started via ln-manager it will be the name fo the process definition which started the lnrecorder. when calling the ``.record``-service you need to provide these fields: - ``filename`` name of to be created lnrdb or ``.pcap``-file - ``patterns`` is a list of ``ln/string``. here you can specify exactly the same topic- / service-name-patterns as you would do on the command line (including the ``@``-rate-limiter for topics). if this list is kept empty, all topics & services will be logged! a service response with empty ``error_message`` field indicates that the lnrecorder is now recording. The ``.record`` service waits up to 10 seconds until topics that existed when recording started have received a first packet, but it still reports success if that maximum wait time expires. The ``.record_waiting`` service uses the same fields as ``.record`` and adds: - ``wait_timeout`` maximum time in seconds to wait until initially existing topics have received packets. The ``.record_waiting`` response includes ``pending_topics``, a list of initially existing topics that had not received packets when ``wait_timeout`` expired. The recording has already started; callers can decide whether a non-empty ``pending_topics`` list is an error for their use case. to stop a recording, you should call the ``.stop`` LN-service. it does not need any request fields. after the lnrecorder is again in its stopped-state, you can either start another recording or start a replay: to replay a log you call the ``.replay`` LN-service and provide these request fields: - ``filename`` is the name of the lnrdb or ``.pcap``-file - ``patterns`` is a list of ``ln/string`` and is optional. use it to only replay a selection of the recorded topics / services. if not specified, all topics / services are replayed. - ``start_time`` start replaying from a given time-offset (seconds) into the recording. use ``0`` to start from the beginning. - ``stop_time`` stop playback at given time in seconds after recording-start. set to ``0`` to play until end of recording. - ``wait`` optional. if set to ``1`` the service call will wait until the replay is finished. otherwise the service call will return as soon as the replay was started a replay can also be stopped via the .stop LN-service from above. Data Accesss ------------ .. index:: triple: user guide; logging; accessing data from lnrecorder Reading logged Data in LNRDB Format into Python ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The process "python example read log" reads the generated log data, and prints some values. It uses the "lnrdb" python module from the links_and_nodes_python package, ands its listing looks like this: .. literalinclude:: examples/guide/example_logging_lnrdb/read_log.py :language: python :linenos: An important aspect of the lnrdb library is its capability to return memory-mapped numpy arrays. This means that it is possible to access time series which are far larger than a size that would fit into memory, but in the same way as a "normal" memory-backed numpy array. Data is read from disk only on-demand and can also be modified. .. index:: triple: user guide; logging; convert data into hdf5 format pair: hdf5 format; logging Converting Logged Data to HDF5 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The process "convert log" reads the LNRDB binary data and saves it in :term:`HDF5` format. HDF5 is a scientific data format that was developed for time series of large amounts of geophysical data, for example. One advantage it has is that it can hold any relevant metadata, and also that it can be easily converted into other formats. The converter script shown here takes at least two arguments: The first is the hdf5 file which is to be generated as output, and the following ones are the lnrdb recordings which should be included in the hdf5 file. That file acts like a kind of archive and container and can hold multiple time series as well as their meta data. .. literalinclude:: examples/guide/example_logging_lnrdb/convert_logs.py :language: python :linenos: Once the conversion is done, no dependencies on LN or the real-time run-time system are needed, which means that the data can be analysed off-line if desired. there is also a ready to use conan-packaged stand-alone converter: ``lnrdb_to_hdf5_converter/1.0.0@common/unstable`` for details check here https://rmc-github.robotic.dlr.de/common/lnrdb_to_hdf5_converter Reading HDF5 Data ~~~~~~~~~~~~~~~~~ As with the lnrdb module, such HDF5 files can be used to access memory-mapped binary data. The script "read_hdf5.py" shows how this can be done: .. literalinclude:: examples/guide/example_logging_lnrdb/read_hdf5.py :language: python :linenos: .. index:: triple: user guide; logging; reading lnrecorder data into Matlab Reading HDF5 into Matlab ~~~~~~~~~~~~~~~~~~~~~~~~ The script in :file:`matlab/example_hdf5.m` shows how a :term:`HDF5` file can be accessed in matlab. .. literalinclude:: examples/guide/example_logging_lnrdb/matlab/example_hdf5.m :language: matlab :linenos: Here, the hdf5 reader routine returns matlab matrix objects. In difference to Numpy, these objects are not memory-mapped. This means that if the time series is very large, only a selection of it can be read into memory at the same time (Matlab is capable to access memory-mapped data, but only with low-level routines). Direct Access to LNRDB Data in Matlab ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As a last example, we show some scripts which read binary LNRDB data directly in matlab: .. literalinclude:: examples/guide/example_logging_lnrdb/matlab/example.m :language: matlab :linenos: .. literalinclude:: examples/guide/example_logging_lnrdb/matlab/read_topic1.m :language: matlab :linenos: .. literalinclude:: examples/guide/example_logging_lnrdb/matlab/read_topic2.m :language: matlab :linenos: The offsets need to be computed from the message definitions. As we see, this approach is more fragile, since it needs to use manually computed direct offsets into the binary files, which depend on the message definitions. Any change to the message definitions will render these offsets invalid, requiring consistent updates to avoid reading invalid binary data. .. only:: rmc .. tip:: A repository with this full example in runnable form is available under https://rmc-github.robotic.dlr.de/schm-fl/lnrdb_example/tree/convert-hdf5 (branch convert-hdf5) .. only:: rmc Using the LN recorder with Cissy ================================ If you want to use the LN recorder with cissy, this cissy workspace file and LN configuration can be used as a template. Both together ensure that the right minimum version of lnrecorder and lnrdb python module are used: :file:`cissy_workspace.yaml` .. literalinclude:: examples/guide/example_logging_lnrdb/cissy/cissy_workspace.yaml :language: yaml :linenos: :file:`lnrdb_cissy.lnc` .. literalinclude:: examples/guide/example_logging_lnrdb/cissy/lnrdb_cissy.lnc :language: lnc :linenos: Troubleshooting ~~~~~~~~~~~~~~~ .. index:: triple: user guide; logging; troubleshooting missing metadata If any metadata files appear to be missing, check that the right LN version >= 2.1.0 is used, and that the lnrecorder program is started either with the "exec" shell command, or with the "execvpe" flag in the LNM configuration for the process! The LN Daemon Logger ==================== .. index:: triple: user guide; logging; example LN daemon logger The LN daemon logger can be accessed by the LN client API, which allows to instruct an LN daemon to log message data to an `circular buffer `_ in RAM. It is run from an LN client, and instructs the daemon on a given node to log named topics. Here is an example script how to use it: .. literalinclude:: examples/guide/example_logging_daemon_logger/lndaemon_logger.py :language: python :linenos: Here, ``logger.add_topic()`` configures logging for two topics: ``test.topic1`` and ``test.topic2``, for each topic the last 100000 samples will be kept in RAM. ``logger.enable()`` starts logging, and ``logger.disable()`` stops it. ``logger.manager_save(filename)`` instructs the LN manager to fetch the collected data from the daemons and save it to the given file. The default is to log the "source shared memory" on the host the publisher runs on. Each shared-memory can only have one logger running at a time. If you have a subscriber on a different host or a subscriber with ``subscriber-rate != -1`` they will get their own shared-memory. If you want to log data that arrives there, you need to specify a topic-name in the form of ``.add_topic("subscriber:[@]:)`` where ```` is the unique name of the ln-client doing that subscription (check the ln_manager's "clients"-tab). specifying a ```` is only useful if that client does multiple subscriptions of the same topic (not many use-cases). In the example, the file name terminaes with "pcap", so that PCAP format is used. The logger can also save in Matlab's ``.mat`` format or python's ``.pickle``. At the time being, recording of complex data structures is probably done best with the LN recorder with its "lnrdb" format that can be converted into :term:`HDF5` if needed. .. index:: triple: user guide; logging; comparison lnrecorder vs ln daemon logger LN Recorder and Daemon Logger: When to use Which One ==================================================== ============================= ================================ ======================================= aspect LN recorder Daemon logger ============================= ================================ ======================================= max duration huge; limited by file system limited by ram of publisher-node max data rate limited by network very fast; limited by shared memory suitable for complex data yes less stores meta data yes no supported file formats LNRDB, :term:`HDF5`, PCAP PICKLE, PCAP, MAT, RAW memory-mapped view avilable yes no (could be done with RAW format) typical usage recording motions, I/O, workflow debugging hardware drivers, controllers ============================= ================================ ======================================= .. only:: rmc .. tip:: Another option for logging is the "monitor" package, which provides logging of status messages that is similar to syslog message recording and transmission.