KbA0004: Data storage capacity of CopperCube

Description

I am curious about Coppercube’s internal database storage. I want assurance that the device can hold the number of TLs I plan to collect for the time span I require.

Solution

Each trend log is stored as a series of ‘day buckets’, where each bucket contains the trend samples for a given trend log for a given day, along with metadata describing the date of the samples, the number of samples, and the device and TL from which the samples were taken. This design offers a good trade-off between storage density and query efficiency for the kinds of data and queries the Coppercube is expected to support. It also gracefully handles the fact that trend logs vary widely in the number of samples they record each day: a binary COV (change-of-value) trend log monitoring a user setpoint behaves very differently from an input status trend log polling every minute, or from a power meter trending energy consumption every 15 minutes.

Day Buckets

The Coppercube uses a NoSQL database (MongoDB) to store trend log data, as it is better suited to handling the variable-sized day buckets than a traditional SQL database. The structure of an individual day bucket is:

Day bucket: {
   _id: <sitename.100.1.datestamp>   # string (16-N bytes), typically 32 chars
   l: <sitename>             # string (1-N bytes), typically 10 chars
   d: <device_id>            # int (4 bytes)
   t: <trend_id>             # int (4 bytes)
   dt: <base_date>           # timestamp in epoch (4 bytes)
   c: <sample_count>         # int (4 bytes)
   dts: <timestamp>          # Datestamp (8 bytes)
   fs: <1st sequence #>      # int (4 bytes)
   ls: <last sequence #>     # int (4 bytes)

   s: [ <samples>, ]         # array of sample tuples (0..N), where each sample is a 9-12 byte tuple consisting of:
       t: <int 4 bytes>      #  bits (0-3):  sample type
                             #  bits (4-7):  sample size (bytes)
                             #  bits (8-31): timestamp (hundredths of seconds since midnight = 0..8640000)
       seq: <int 4 bytes>    # sequence number
       data: <1-4 bytes>     # actual sample value; length depends on sample type (listed below)
}
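To make the shape concrete, here is a sketch of a single day bucket as a Python dict (roughly the document MongoDB would store). The field names follow the structure above; the concrete values (the "acme" sitename, device 100, trend 1, the timestamps) are illustrative assumptions only.

```python
# Hypothetical example of one day bucket holding two Real (0x02) samples.
# The "t" header packs: type (bits 0-3), size (bits 4-7), time-of-day in
# hundredths of seconds (bits 8-31), per the layout described above.
day_bucket = {
    "_id": "acme.100.1.20240115",  # <sitename>.<device_id>.<trend_id>.<datestamp>
    "l": "acme",                   # sitename
    "d": 100,                      # device_id
    "t": 1,                        # trend_id
    "dt": 1705276800,              # base_date (epoch seconds for the day)
    "c": 2,                        # sample_count
    "dts": 1705276800,             # datestamp
    "fs": 1001,                    # first sequence number in the bucket
    "ls": 1002,                    # last sequence number in the bucket
    "s": [                         # sample tuples
        {"t": 0x02 | (4 << 4) | (0 << 8),     "seq": 1001, "data": 21.5},
        {"t": 0x02 | (4 << 4) | (30000 << 8), "seq": 1002, "data": 21.7},
    ],
}
```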

Sample types:
0x00 => LogStatus    (Stored in 1 byte) 
0x01 => Boolean      (Stored in 1 byte) 
0x02 => Real         (4 extra bytes) 
0x03 => Enum         (Stored in 1 byte, values 0-254, 255 is reserved for special use)
0x04 => Unsigned     (4 extra bytes) 
0x05 => Signed       (4 extra bytes) 
0x06 => BitString    (2/4/6 extra bytes)
0x07 => Null         (Stored in 1 byte)
0x08 => Error Class  (4 extra bytes)
0x09 => TimeChange   (4 extra bytes)
*note: 8-byte doubles are not supported.
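The 4-byte sample header packs three fields into one integer. A minimal sketch of the packing and unpacking, assuming the bit layout shown above (the function names are mine):

```python
def pack_sample_header(sample_type: int, size: int, hundredths: int) -> int:
    """Pack type (bits 0-3), size in bytes (bits 4-7), and time-of-day in
    hundredths of seconds since midnight (bits 8-31) into a 32-bit int."""
    assert 0x00 <= sample_type <= 0x09      # the types listed above
    assert 0 <= size <= 15                  # 4 bits
    assert 0 <= hundredths <= 8_640_000     # 24 hours fits in 24 bits
    return sample_type | (size << 4) | (hundredths << 8)

def unpack_sample_header(header: int) -> tuple[int, int, int]:
    """Inverse of pack_sample_header: (type, size, hundredths)."""
    return header & 0x0F, (header >> 4) & 0x0F, header >> 8

# e.g. a Real (0x02) sample, 4 data bytes, recorded at 08:30:00
h = pack_sample_header(0x02, 4, (8 * 3600 + 30 * 60) * 100)
```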

Capacity Calculations

The Coppercube has a stated capacity of 5000 TLs for 5 years.

How was this derived? More importantly, how do we ensure it can be achieved? We know the following:

  • the number of samples in a day for a TL is a direct function of the TL sampling frequency.
  • the Coppercube supports TLs polling from once a minute to once a year and COV TLs (with bursts to 1-sec resolution).
  • an analog value in BACnet is a 4-byte floating-point number, and a BACnet binary value is a 1-byte boolean value.
  • there is an overhead of 8 bytes for every sample.
  • there is an overhead of at least 76 bytes for every day bucket (more with longer site names). Let’s assume 100 bytes.
    • note: overhead for an 8-character sitename = 104 bytes; for a 17-character sitename = 144 bytes.
  • TLs that cannot be reliably collected (due to configuration issues) are automatically excluded.

Thus we can calculate the following day bucket storage sizes:

Frequency   Samples/day   Sample Type & Size   Overhead (bytes)   Total (bytes)
1 minute    1440          Binary (9 bytes)     100                13060
            1440          Analog (12 bytes)    100                17380
5 minute    288           Binary (9 bytes)     100                2692
            288           Analog (12 bytes)    100                3556
15 minute   96            Binary (9 bytes)     100                964
            96            Analog (12 bytes)    100                1252
hourly      24            Binary (9 bytes)     100                316
            24            Analog (12 bytes)    100                388
COV         unknown*      either               100                ?*

*note: if a COV TL generates more than 1440 samples a day, it is wise to switch it to a 1-minute polled TL.
*note: COV TLs are very CPU-expensive for a controller; unless they are saving significant space, you are paying all of the costs and getting none of the benefits.
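The table values follow directly from samples/day × bytes/sample + bucket overhead. A quick sketch (the function name is mine) that reproduces them:

```python
def day_bucket_bytes(samples_per_day: int, sample_bytes: int, overhead: int = 100) -> int:
    """Size of one day bucket: per-sample cost plus the assumed
    100-byte bucket overhead."""
    return samples_per_day * sample_bytes + overhead

# binary samples cost 9 bytes, analog samples cost 12 bytes
print(day_bucket_bytes(1440, 9))   # 1-minute binary  -> 13060
print(day_bucket_bytes(1440, 12))  # 1-minute analog  -> 17380
print(day_bucket_bytes(24, 12))    # hourly analog    -> 388
```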

From these sizes we can calculate the space required for each type of TL:

Frequency   Type     Size – 1 year   Size – 5 years
1 minute    Binary   4.54MB          22.73MB
            Analog   6.0MB           30.25MB
5 minute    Binary   0.94MB          4.69MB
            Analog   1.24MB          6.19MB
15 minute   Binary   343.6K          1.68MB
            Analog   446.3K          2.18MB
hourly      Binary   112.6K          0.55MB
            Analog   138.3K          0.68MB
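These per-TL figures follow from the day-bucket sizes; they appear to assume 365 days per year and 1MB = 1024 × 1024 bytes. A small sketch (the function name is mine) that reproduces them:

```python
def tl_storage_mb(samples_per_day: int, sample_bytes: int, years: int = 1,
                  days_per_year: int = 365, overhead: int = 100) -> float:
    """Approximate storage for one TL: daily bucket size times days stored,
    expressed in MB (1MB = 1024 * 1024 bytes)."""
    daily_bucket = samples_per_day * sample_bytes + overhead
    return daily_bucket * days_per_year * years / (1024 * 1024)

print(round(tl_storage_mb(1440, 9), 2))            # 1-minute binary, 1 year  -> ~4.55MB
print(round(tl_storage_mb(1440, 12, years=5), 2))  # 1-minute analog, 5 years -> ~30.25MB
```

Small differences from the table (e.g. 4.55 vs 4.54) come down to rounding versus truncation of the last digit.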

If we know the mix of TLs residing on a set of buildings, we can calculate the storage space required and then choose an appropriate number of Coppercubes to handle the load. A Coppercube has 60GB of storage, of which 1/3 is reserved for internal use (OS, software, working space). This leaves 40GB for TL storage. It is important that you carefully consider exactly which TLs you should collect, and then configure those TLs appropriately to meet your analysis requirements.

Assuming a realistic:

  • TL mixture of 75% Analog TLs & 25% Binary TLs, with a
  • Polling frequency mixture of (5%-1min / 70%-5min / 20%-15min / 5%-hourly)

we can calculate the storage required to save 5 years of data:

TL Polling Frequency   Analog TLs (75%)     Required Storage   Binary TLs (25%)    Required Storage
1 minute               5% of 3750 = 188     5.55GB             5% of 1250 = 63     1.39GB
5 minute               70% of 3750 = 2625   15.87GB            70% of 1250 = 875   4.01GB
15 minute              20% of 3750 = 750    1.60GB             20% of 1250 = 250   0.41GB
hourly                 5% of 3750 = 188     0.12GB             5% of 1250 = 63     0.03GB
                       Total                23.14GB            Total               5.84GB

Grand Total = 28.98GB, or 72% of the allotted 40GB
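The mix calculation above can be sketched in a few lines, using the 5-year per-TL sizes from the earlier table (the variable names and dict keys are mine):

```python
# 5-year storage per TL in MB, as (analog, binary), from the sizes table above
five_year_mb = {"1min": (30.25, 22.73), "5min": (6.19, 4.69),
                "15min": (2.18, 1.68), "hourly": (0.68, 0.55)}
# assumed polling-frequency mix across the 5000 TLs
mix = {"1min": 0.05, "5min": 0.70, "15min": 0.20, "hourly": 0.05}

analog_tls, binary_tls = 5000 * 0.75, 5000 * 0.25   # 75% analog / 25% binary
total_mb = sum(mix[f] * (analog_tls * a + binary_tls * b)
               for f, (a, b) in five_year_mb.items())
print(f"{total_mb / 1024:.2f}GB of the 40GB allotment")  # ~29GB, ~72%
```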

Storage Time-frame

The Coppercube is designed for long-term (5-year) trend log storage, with an auto-pruning feature to ensure new data will not be lost due to old data consuming all the available storage space. If you require permanent archival of trend log data, both the Kaizen Vault (cloud-based storage) and the Coppercube SQL-Connector (optional feature) provide convenient methods to continuously send trend log data to additional storage locations.

More Information

The Inverse – How much could it hold?

Saying the Coppercube can hold 5000 TLs for 5 years is all fine and dandy, but I like big numbers. How many samples can it hold?

How many samples a Coppercube can hold is the inverse of the original question addressed by this article. Using the same knowledge about sample sizes, sampling frequencies, and conservative derating of the device – i.e. allocating only 40GB to TL storage – we can calculate two ends of the sampling spectrum: a 1 min TL vs. a 1 Hour TL:

The fastest TL

  • a 1 min Analog TL = 1440 samples/day * 365.25 days/year = 525,960 samples/yr.
  • the TL consumes 6.0MB/yr of storage space.
  • so the Coppercube’s 40GB could hold:
    • 40GB = 40960MB; 40960MB / 6.0MB per TL-year = 6826.6 TL-years * 525,960 samples/TL-year = 3,590,553,600 samples
    • = 3.6 billion ‘1 min analog’ samples.
  • note: this TL has the best sample-to-overhead ratio

The slowest TL

  • a 1 hr Analog TL = 24 samples/day * 365.25 days/yr = 8766 samples/yr.
  • the TL consumes 138.3K/yr (~0.14MB) of storage space.
  • so the Coppercube’s 40GB could hold:
    • 40GB = 40960MB; 40960MB / 0.14MB per TL-year = 292,571.4 TL-years * 8766 samples/TL-year = 2,564,680,892 samples
    • = 2.5 billion ‘hourly analog’ samples.
  • note: the total sample count is lower due to the lower sample-to-overhead ratio that occurs with all slower frequencies.
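Both back-of-the-envelope figures follow the same pattern: available MB ÷ MB per TL-year × samples per TL-year. A sketch of that calculation (the function name is mine):

```python
def total_samples(storage_mb: float, mb_per_tl_year: float,
                  samples_per_tl_year: float) -> float:
    """How many samples fit in storage_mb, given the per-TL-year cost."""
    tl_years = storage_mb / mb_per_tl_year
    return tl_years * samples_per_tl_year

fast = total_samples(40 * 1024, 6.0, 1440 * 365.25)   # 1-minute analog TL
slow = total_samples(40 * 1024, 0.14, 24 * 365.25)    # hourly analog TL
print(f"{fast / 1e9:.2f} billion vs {slow / 1e9:.2f} billion")  # -> 3.59 billion vs 2.56 billion
```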

So your answer is: a Coppercube could hold ~3 billion samples (plus or minus a few hundred million).