KbA0004: Data storage capacity of CopperCube

Description

I am curious about the Coppercube's internal database storage. I want assurance that the device can hold the number of trend logs (TLs) I plan to collect for the time span I require.

Solution

Each trend log is stored as a series of 'day buckets', where each bucket contains the trend samples for a given trend log for a given day, along with some metadata describing the date of the samples, the number of samples, and the device and TL from which the samples were taken. This design offers a good trade-off between storage density and query efficiency for the kind of data and queries the Coppercube is expected to support. It also gracefully handles the fact that trend logs vary widely in the number of samples they record each day: a binary COV (change-of-value) trend log monitoring a user setpoint behaves very differently from an input status trend log polled every minute, or from a power meter trending energy consumption every 15 minutes.

Day Buckets

The Coppercube uses a NoSQL database (MongoDB) to store trend log data, as it is better suited to handling the variable-sized day buckets than a traditional SQL database. The structure of an individual day bucket is:

Day bucket: {
   _id: <sitename.100.1.datestamp>    # string (16-N bytes), typically 32 chars
   l: <sitename>             # string (1-N bytes), typically 10 chars
   d: <device_id>            # int (4 bytes)
   t: <trend_id>             # int (4 bytes)
   dt: <base_date>           # timestamp in epoch (4 bytes)
   c: <sample_count>         # int (4 bytes)
   dts: <timestamp>          # Datestamp (8 bytes)
   fs: <1st sequence #>      # int (4 bytes)
   ls: <last sequence #>     # int (4 bytes)

   s: [ <samples>, ]         # array of sample tuples (0..N), where each sample is a (9-12 byte) tuple consisting of:
       t: <int 4 bytes>         #  bits (0-3):  sample type
                                #  bits (4-7):  sample size (bytes)
                                #  bits (8-31): timestamp (hundredths of a second since midnight, 0..8640000)
       seq : <int 4 bytes>   # sequence number
       data: <1-4 bytes>     # actual sample value; length depends on sample type (listed below)
}

Sample types:
0x00 => LogStatus    (Stored in 1 byte) 
0x01 => Boolean      (Stored in 1 byte) 
0x02 => Real         (4 extra bytes) 
0x03 => Enum         (Stored in 1 byte, values 0-254, 255 is reserved for special use)
0x04 => Unsigned     (4 extra bytes) 
0x05 => Signed       (4 extra bytes) 
0x06 => BitString    (2/4/6 extra bytes)
0x07 => Null         (Stored in 1 byte)
0x08 => Error Class  (4 extra bytes)
0x09 => TimeChange   (4 extra bytes)
*note: 8-byte doubles are not supported.
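
To make the layout above more concrete, here is a minimal Python sketch of how such a day bucket could be assembled. It is illustrative only: the field names follow the schema above and the sample-header packing follows the bit layout described for the 't' field, but the _id/datestamp formatting and the way the sample value is stored are assumptions for illustration, not the Coppercube's actual implementation.

import time

# Extra data bytes carried by each sample type (per the list above).
SAMPLE_DATA_BYTES = {
    0x00: 1,  # LogStatus
    0x01: 1,  # Boolean
    0x02: 4,  # Real
    0x03: 1,  # Enum (0-254; 255 reserved)
    0x04: 4,  # Unsigned
    0x05: 4,  # Signed
    0x06: 6,  # BitString (2/4/6 bytes; worst case used here)
    0x07: 1,  # Null
    0x08: 4,  # Error Class
    0x09: 4,  # TimeChange
}

def pack_sample_header(sample_type, hundredths_since_midnight):
    """Pack the 4-byte 't' field: bits 0-3 = type, bits 4-7 = data length
    in bytes, bits 8-31 = hundredths of a second since midnight."""
    size = SAMPLE_DATA_BYTES[sample_type]
    return (sample_type & 0xF) | ((size & 0xF) << 4) | (hundredths_since_midnight << 8)

def make_day_bucket(site, device_id, trend_id, base_date_epoch, samples):
    """samples: list of (sample_type, hundredths_since_midnight, seq, value)."""
    return {
        "_id": f"{site}.{device_id}.{trend_id}.{base_date_epoch}",  # assumed key format
        "l": site,
        "d": device_id,
        "t": trend_id,
        "dt": base_date_epoch,
        "c": len(samples),
        "dts": time.time(),                      # time the bucket was written
        "fs": samples[0][2] if samples else 0,   # first sequence number
        "ls": samples[-1][2] if samples else 0,  # last sequence number
        "s": [
            {"t": pack_sample_header(st, ts), "seq": seq, "data": value}
            for (st, ts, seq, value) in samples
        ],
    }

# e.g. one Real sample at 00:15:00 (90,000 hundredths), sequence number 4711:
bucket = make_day_bucket("sitename", 100, 1, 1735689600, [(0x02, 90_000, 4711, 21.5)])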

Capacity Calculations

The Coppercube has a stated capacity of 5000 TLs for 5 years.

How was this derived? More importantly, how do we ensure this can be achieved? We know the following:

  • the number of samples in a day for a TL is a direct function of the TL's sampling frequency.
  • the Coppercube supports TLs polled from once a minute to once a year, as well as COV TLs (with bursts to 1-second resolution).
  • an analog value in BACnet is a 4-byte floating-point number, and a BACnet binary value is a 1-byte boolean value.
  • there is an overhead of 8 bytes for every sample.
  • there is an overhead of at least 76 bytes for every day bucket (more with longer site names). Let's assume 100 bytes.
    • note: overhead for an 8-character sitename = 104 bytes; for a 17-character sitename = 144 bytes.
  • TLs that cannot be reliably collected (due to configuration issues) are automatically excluded.

Thus we can calculate the following day bucket storage sizes:

Frequency    Samples/day   Sample Type & Size   Overhead (bytes)   Total (bytes)
1 minute     1440          Binary (9 bytes)     100                13060
             1440          Analog (12 bytes)    100                17380
5 minute     288           Binary (9 bytes)     100                2692
             288           Analog (12 bytes)    100                3556
15 minute    96            Binary (9 bytes)     100                964
             96            Analog (12 bytes)    100                1252
hourly       24            Binary (9 bytes)     100                316
             24            Analog (12 bytes)    100                388
COV          unknown*      either               100                ?*

*note: if a COV TL generates more than 1440 samples a day, it is wise to switch it to a 1-minute polled TL.
*note: COV TLs are very CPU-expensive for a controller; unless you are saving significant space, you are paying all the costs and getting none of the benefits.
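
The sizes in the table can be reproduced from the assumptions listed above (8 bytes of per-sample overhead plus 1 data byte for binary or 4 for analog, and a flat 100-byte bucket overhead). A quick Python sketch:

# Rough reproduction of the day-bucket sizes above:
# 9 bytes per binary sample, 12 per analog sample, 100-byte bucket overhead.
BUCKET_OVERHEAD = 100                        # bytes (assumed above)
SAMPLE_BYTES = {"binary": 9, "analog": 12}   # per-sample size incl. 8-byte overhead

def day_bucket_bytes(samples_per_day, kind):
    return samples_per_day * SAMPLE_BYTES[kind] + BUCKET_OVERHEAD

for label, n in [("1 minute", 1440), ("5 minute", 288), ("15 minute", 96), ("hourly", 24)]:
    print(label, day_bucket_bytes(n, "binary"), day_bucket_bytes(n, "analog"))
# prints: 1 minute 13060 17380 / 5 minute 2692 3556 / 15 minute 964 1252 / hourly 316 388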

From these sizes we can calculate the space required for each type of TL:

Frequency    Type     Size – 1 year   Size – 5 years
1 minute     Binary   4.54 MB         22.73 MB
             Analog   6.0 MB          30.25 MB
5 minute     Binary   0.94 MB         4.69 MB
             Analog   1.24 MB         6.19 MB
15 minute    Binary   343.6 KB        1.68 MB
             Analog   446.3 KB        2.18 MB
hourly       Binary   112.6 KB        0.55 MB
             Analog   138.3 KB        0.68 MB
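
These per-TL figures follow directly from the daily bucket sizes; the table appears to use 365 days per year and 1024-based MB/KB, and the sketch below lands on the same values to within rounding:

# Per-TL storage from the daily bucket sizes, assuming 365 days/year and
# 1024-based MB/KB (matches the table above to within rounding).
def tl_storage_mb(bucket_bytes, years=1, days_per_year=365):
    return bucket_bytes * days_per_year * years / (1024 * 1024)

print(round(tl_storage_mb(13060), 2))            # ~4.55 MB  (1-min binary, 1 year)
print(round(tl_storage_mb(17380, years=5), 2))   # ~30.25 MB (1-min analog, 5 years)
print(round(tl_storage_mb(388) * 1024, 1))       # ~138.3 KB (hourly analog, 1 year)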

If we know the mix of TLs residing on a set of buildings, we can calculate the required storage space and then choose an appropriate number of Coppercubes to handle the load. A Coppercube has 60GB of storage; one third is reserved for internal use (OS, software, working space), which leaves 40GB for TL storage. It is important that you carefully consider exactly which TLs you should collect, and then configure those TLs appropriately to meet your analysis requirements.

Assuming a realistic:

  • TL mixture of 75% Analog TLs & 25% Binary TLs, with a
  • Polling frequency mixture of (5%-1min / 70%-5min / 20%-15min / 5%-hourly)

we can calculate the storage required to save 5 years of data:

TL Polling Frequency   Analog TLs (75%)     Required Storage   Binary TLs (25%)     Required Storage
1 minute               5% of 3750 = 188     5.55 GB            5% of 1250 = 63      1.39 GB
5 minute               70% of 3750 = 2625   15.87 GB           70% of 1250 = 875    4.01 GB
15 minute              20% of 3750 = 750    1.60 GB            20% of 1250 = 250    0.41 GB
Hourly                 5% of 3750 = 188     0.12 GB            5% of 1250 = 63      0.03 GB
                       Total                23.14 GB           Total                5.84 GB
Grand Total = 28.98 GB, or ~72% of the 40 GB allotted to TL storage
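
The same arithmetic can be sketched end-to-end. The snippet below uses the 5-year per-TL sizes from the earlier table and the assumed 75/25 analog/binary and 5/70/20/5 polling mixes, and lands on the same ~29GB total (small differences are rounding):

# Fleet-level storage for 5000 TLs over 5 years, using the per-TL
# 5-year sizes from the table above and the assumed TL mix.
FIVE_YEAR_MB = {                 # (analog MB, binary MB) per TL over 5 years
    "1 minute": (30.25, 22.73),
    "5 minute": (6.19, 4.69),
    "15 minute": (2.18, 1.68),
    "hourly": (0.68, 0.55),
}
POLL_MIX = {"1 minute": 0.05, "5 minute": 0.70, "15 minute": 0.20, "hourly": 0.05}

total_tls = 5000
analog_tls = total_tls * 0.75
binary_tls = total_tls * 0.25

total_gb = 0.0
for freq, share in POLL_MIX.items():
    analog_mb, binary_mb = FIVE_YEAR_MB[freq]
    total_gb += (analog_tls * share * analog_mb + binary_tls * share * binary_mb) / 1024

print(f"{total_gb:.2f} GB of the 40 GB available")   # ~28.97 GB, i.e. ~72%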

Storage Time-frame

The Coppercube is designed for long-term (5-year) trend log storage, with an auto-pruning feature to ensure new data will not be lost due to old data consuming all the available storage space. If you require permanent archival of trend log data, both the Kaizen Vault (cloud-based storage) and the Coppercube SQL-Connector (optional feature) provide convenient methods to continuously send trend log data to additional storage locations.

More Information

The Inverse – How much could it hold?

Saying the Coppercube can hold 5000 TLs for 5 years is all fine and dandy, but I like big numbers. How many samples can it hold?

How many samples a Coppercube can hold is the inverse of the original question addressed by this article. Using the same knowledge about sample sizes and sampling frequencies, and a conservative derating of the device (i.e. allocating only 40GB to TL storage), we can calculate the two ends of the sampling spectrum: a 1-minute TL vs. a 1-hour TL. Both calculations are sketched in code after the lists below.

The fastest TL

  • a 1-minute Analog TL generates 1440 samples/day * 365.25 days/year = 525,960 samples/year.
  • the TL consumes 6.0MB/year of storage space.
  • so the Coppercube's 40GB could hold:
    • 40GB = 40960MB; 40960MB / 6.0MB = 6826.6 TL-years * 525,960 samples/TL-year = 3,590,553,600 samples
    • = 3.6 billion '1-minute analog' samples.
  • note: this TL has the best sample-to-overhead ratio.

The slow TL

  • a 1-hour Analog TL generates 24 samples/day * 365.25 days/year = 8766 samples/year.
  • the TL consumes 138.3KB/year of storage space.
  • so the Coppercube's 40GB could hold:
    • 40GB = 40960MB; 40960MB / 0.14MB = 292,571.4 TL-years * 8766 samples/TL-year = 2,564,680,892 samples
    • = 2.5 billion 'hourly analog' samples.
  • note: the total sample count is lower because of the poorer sample-to-overhead ratio that comes with all slower frequencies.
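
Both estimates can be checked with the same arithmetic (40GB treated as 40960MB of TL storage, as above). A quick sketch:

# Sanity check of the two sample-count estimates above (40 GB = 40960 MB).
def max_samples(mb_per_tl_year, samples_per_tl_year, budget_mb=40960):
    tl_years = budget_mb / mb_per_tl_year
    return tl_years * samples_per_tl_year

print(f"{max_samples(6.0, 1440 * 365.25):,.0f}")   # 3,590,553,600  (1-min analog)
print(f"{max_samples(0.14, 24 * 365.25):,.0f}")    # ~2,564,681,143 (hourly analog; the
                                                   # text's 2,564,680,892 truncates TL-years)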

So your answer is: a Coppercube could hold roughly 3 billion samples (give or take a few hundred million).