Savvy DIAdem Solutions - BLOG


 

NI DIAdem Vs. Time-Series Database

NI DIAdem is able to ingest a CSV text file with 87,264,000 random IEEE-754 64-bit floating-point values at a rate of 722,982 values/s, or 53.8 GiB/hr. Once the TDMS file has been created, reading it takes 0.1 sec or 1.36E+09 values/s, and writing any changes takes 3.0 sec or 2.86E+07 values/s.

NI DIAdem data ingestion performance was measured to be at least 27x faster than the following online time series databases: InfluxDB, Cassandra, Elasticsearch, MongoDB, OpenTSDB, Graphite, and Splunk. The comparison is based on using the same number of cores.

NI DIAdem is software for the management, storage, analysis, and visualization of data from data acquisition systems. It is optimized to ingest, store, and analyze test data from a wide variety of time series measurement file formats.

InfluxDB is an open-source time series database (TSDB) for storage and retrieval of time series data.   The TSDB is hosted in the cloud and tools are available online for data analysis and visualization.  

  • NI DIAdem was able to ingest the data 27x faster than InfluxDB, despite the modest hardware used for the NI DIAdem performance test: a laptop with an AMD Ryzen 7 5700U CPU (8 cores), 16 GB RAM, and an SSD, running NI DIAdem v2022 on Windows 11. This comparison is based on the InfluxDB per-server ingestion performance of 26,749 values/s.
  • The NI DIAdem data set contained only floating-point values. A mix of other data types (string, boolean, integer, etc.) could have been used, and probably would have resulted in even better ingestion performance. Importing full-precision floating-point values was considered the most challenging case for NI DIAdem.
  • NI DIAdem can ingest binary and compressed data at a significantly higher rate than text / CSV files.  
  • InfluxDB can match and exceed NI DIAdem's ingestion performance when 100 servers are employed.   However, the InfluxDB servers must be in close proximity to the source data and with a very high speed data connection in order to achieve the results reported in the InfluxDB test.  
  • Both InfluxDB and NI DIAdem will experience significant data transfer delays if the source data is remotely pushed from slower data connections such as cellular, WiFi, internet, etc.  
  • InfluxDB performance was significantly better than that of the other time series databases: Cassandra (5x faster), Elasticsearch (3.8x faster), MongoDB (1.9x faster), OpenTSDB (5x faster), Graphite (14x faster), and Splunk (17x faster).

 

InfluxDB Performance Comparisons

As of October 2022, InfluxDB has whitepapers posted on its website demonstrating faster write (data ingestion) performance than the following competing time series databases: Cassandra (5x faster), Elasticsearch (3.8x faster), MongoDB (1.9x faster), OpenTSDB (5x faster), Graphite (14x faster), and Splunk (17x faster). The comparison characterizes the write performance (data ingestion) in terms of values per second. The values vary in data type and randomly by value, resulting in a random data package size. This makes it easy to assess the data ingestion performance for a variety of applications, but difficult to convert that ingestion rate to bytes/second/server for comparison with systems other than those covered in the whitepapers. In all of the comparisons, 100 servers were used concurrently to process 87,264,000 values and the performance was measured over the ingestion of 100 values.

In the comparison of InfluxDB to Elasticsearch, the whitepaper claims the average write (ingestion) throughput of InfluxDB was 2,674,948 values per second utilizing 100 servers (or 26,749 values/s per server). In the comparison of InfluxDB to MongoDB, the whitepaper states that the write (ingestion) performance of InfluxDB was 2,644,765 values per second utilizing 100 servers (or 26,447 values/s per server).
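The per-server figures above are simply the cluster-wide throughput divided by the number of servers. A minimal Python sketch of that arithmetic follows; the dictionary keys and variable names are mine and are only for illustration.

```python
# Cluster-wide InfluxDB write throughput quoted from the two whitepapers (values/s).
cluster_rates = {
    "vs_elasticsearch": 2_674_948,
    "vs_mongodb": 2_644_765,
}
SERVERS = 100  # servers used concurrently in each comparison

# Per-server rate = cluster rate / number of servers.
per_server = {name: rate / SERVERS for name, rate in cluster_rates.items()}

for name, rate in per_server.items():
    print(f"{name}: {rate:,.2f} values/s per server")
```

The figures quoted in the text are these results with the fractional part dropped.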

NI DIAdem Ingestion Performance

In InfluxDB's performance comparison to MongoDB, a total of 87,264,000 values were created in the test data set, and then the performance was measured as 100 values were ingested.   This type of measurement is difficult to replicate in DIAdem without adversely affecting the ingestion process.  

I wrote an NI DIAdem script to create a CSV text file with 87,264,000 values in 872,640 lines, with each line containing a Unix timestamp with nanosecond precision followed by 100 random IEEE-754 64-bit floating-point numbers, all delimited by semicolons (e.g. 1676996959000000000;3.94714746398025E+299;4.55576917883636E+299;..). It took NI DIAdem 120.7 sec to read the 1.80 GiB uncompressed text file and write it to a new TDMS binary file, for an average ingestion rate of 722,982 values/s (87,264,000 values / 120.7 sec), or 53.8 GiB/hr. Reading the TDMS file after it is created takes 0.1 sec or 1.36E+09 values/s, and writing any changes takes only 3.0 sec or 2.86E+07 values/s.
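For readers who want to build a similar test file, here is a rough Python sketch of the line format described above. It is not the DIAdem script used for the test. The function names, the one-second timestamp step, the fixed seed, and the way the random doubles are drawn (random 64-bit patterns, skipping NaN and infinity) are all my assumptions for illustration.

```python
import math
import random
import struct


def random_double(rng):
    """Return a finite double built from a random 64-bit pattern (assumed method)."""
    while True:
        (x,) = struct.unpack("<d", rng.getrandbits(64).to_bytes(8, "little"))
        if math.isfinite(x):  # skip NaN and +/-infinity
            return x


def make_line(timestamp_ns, rng, values_per_line=100):
    """One CSV line: nanosecond Unix timestamp, then the values, ';' delimited."""
    fields = [str(timestamp_ns)]
    # 15 significant digits, matching the look of the example line in the text.
    fields += [f"{random_double(rng):.14E}" for _ in range(values_per_line)]
    return ";".join(fields)


def write_csv(path, n_lines, start_ns=1676996959000000000,
              step_ns=1_000_000_000, seed=0):
    """Write n_lines lines; the timestamp step and seed are hypothetical choices."""
    rng = random.Random(seed)
    with open(path, "w", newline="\n") as f:
        for i in range(n_lines):
            f.write(make_line(start_ns + i * step_ns, rng) + "\n")


# Show one sample line, truncated to keep the output readable.
print(make_line(1676996959000000000, random.Random(0))[:80] + "...")
```

Calling `write_csv("random.csv", 872_640)` would produce a file with 87,264,000 values in the same layout, although the exact size depends on the values drawn.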

NI DIAdem was able to ingest the data 27 times faster than InfluxDB, relative to the InfluxDB rate of 26,447 values/s per server.
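The 27x figure can be checked from the numbers already given. The sketch below recomputes the NI DIAdem ingestion rate and its ratio to both InfluxDB per-server rates quoted from the whitepapers, and also compares it with the full 100-server cluster rate. Variable names are mine.

```python
TOTAL_VALUES = 87_264_000   # values in the CSV test file
INGEST_SECONDS = 120.7      # measured time to read the CSV and write the TDMS file

diadem_rate = TOTAL_VALUES / INGEST_SECONDS  # values/s

# InfluxDB per-server rates from the two whitepaper comparisons (values/s).
influx_per_server = {"vs_elasticsearch": 26_749, "vs_mongodb": 26_447}
speedups = {name: diadem_rate / rate for name, rate in influx_per_server.items()}

# Ratio of the 100-server InfluxDB cluster rate to the single-laptop DIAdem rate.
cluster_ratio = 2_674_948 / diadem_rate

print(f"NI DIAdem: {diadem_rate:,.0f} values/s")
for name, s in speedups.items():
    print(f"speedup {name}: {s:.1f}x")
print(f"100-server InfluxDB cluster vs NI DIAdem: {cluster_ratio:.1f}x")
```

Both per-server rates give a ratio a little above 27, which is why either one supports the 27x statement.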

Source Data Proximity to Ingestion Application

It is critical to recognize that all of the above tests were conducted with the source data in close proximity to the application (NI DIAdem or a time-series database), and with a high-speed data connection between the source and the ingestion application.
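To get a feel for how much the connection matters, the sketch below estimates the ideal time to move the test file over a few link speeds and compares it with the measured 120.7 sec ingestion time. The link speeds are hypothetical values picked only for illustration, the file size is assumed to be 1.80 GiB, and protocol overhead and latency are ignored, so real transfers would be slower.

```python
def transfer_seconds(size_gib, link_mbit_per_s):
    """Ideal time to move size_gib over a link, ignoring overhead and latency."""
    size_bits = size_gib * 2**30 * 8
    return size_bits / (link_mbit_per_s * 1_000_000)


FILE_SIZE_GIB = 1.80     # test file size, assumed to be in GiB
INGEST_SECONDS = 120.7   # measured local ingestion time

# Hypothetical link speeds in Mbit/s, not taken from any of the tests above.
for mbit in (10, 100, 1000):
    t = transfer_seconds(FILE_SIZE_GIB, mbit)
    print(f"{mbit:>5} Mbit/s: {t:8.1f} s transfer vs {INGEST_SECONDS} s ingestion")
```

Under these assumptions, a link would need to be well above 100 Mbit/s before the transfer stops taking longer than the ingestion itself.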

 

Do you need help with your project?   Send me an email requesting a free phone / web share consultation.  


 

Copyright © 2021,2022,2023 Mechatronic Solutions LLC, All Rights Reserved