Metrics
Prometheus-formatted metrics endpoints can be configured for each container (including the OpenRemote Manager). To scrape these endpoints you need either a running Prometheus server or a cloud provider service, for example AWS CloudWatch.
Refer to the website of each container app for details of the metrics it exposes and their meaning. An overview of the OpenRemote Manager metrics follows.
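
To make the endpoint format concrete, here is a minimal sketch of reading Prometheus text exposition output in Python. The sample text and its values are made up for illustration, and `parse_metrics` is a hypothetical helper rather than part of OpenRemote or Prometheus. It is deliberately simplified: it ignores optional timestamps and treats everything inside the braces as labels.

```python
import re

# Illustrative sample in the Prometheus text exposition format; values are made up.
SAMPLE = """\
# HELP artemis_connection_count Number of clients connected to this server
# TYPE artemis_connection_count gauge
artemis_connection_count{broker="localhost"} 3.0
# TYPE executor_active_threads gauge
executor_active_threads{name="ContainerExecutor"} 2.0
executor_active_threads{name="ContainerScheduledExecutor"} 1.0
# TYPE or_rules_seconds_max gauge
or_rules_seconds_max 0.25
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# label_name="label value" pairs inside the braces
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In practice a Prometheus server or cloud agent does this parsing for you; the sketch is only meant to show how metric names, labels, and values in the tables on this page appear on the wire.
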
OpenRemote Manager
| Component | Metric name | Type | Labels | Description |
|---|---|---|---|---|
| Artemis | artemis_active | gauge | broker: localhost | If the server is active |
| Artemis | artemis_address_memory_usage | gauge | broker: localhost | Memory used by all the addresses on broker for in-memory messages |
| Artemis | artemis_address_memory_usage_percentage | gauge | broker: localhost | Memory used by all the addresses on broker as a percentage of the global-max-size |
| Artemis | artemis_address_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost | The number of estimated bytes being used by all the queue(s) bound to this address; used to control paging and blocking |
| Artemis | artemis_authentication_count | gauge | broker: localhost; result: failure, success | Number of authentication attempts, by result (success or failure) |
| Artemis | artemis_authorization_count | gauge | broker: localhost; result: failure, success | Number of authorization attempts, by result (success or failure) |
| Artemis | artemis_connection_count | gauge | broker: localhost | Number of clients connected to this server |
| Artemis | artemis_consumer_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of consumers consuming messages from this queue |
| Artemis | artemis_delivering_durable_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of durable messages that this queue is currently delivering to its consumers |
| Artemis | artemis_delivering_durable_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of durable messages that this queue is currently delivering to its consumers |
| Artemis | artemis_delivering_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages that this queue is currently delivering to its consumers |
| Artemis | artemis_delivering_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of messages that this queue is currently delivering to its consumers |
| Artemis | artemis_disk_store_usage | gauge | broker: localhost | Fraction of total disk store used |
| Artemis | artemis_durable_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of durable messages currently in this queue (includes scheduled, paged, and in-delivery messages) |
| Artemis | artemis_durable_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of durable messages currently in this queue (includes scheduled, paged, and in-delivery messages) |
| Artemis | artemis_limit_percent | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost | The % of memory limit (global or local) that is in use by this address |
| Artemis | artemis_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages currently in this queue (includes scheduled, paged, and in-delivery messages) |
| Artemis | artemis_messages_acknowledged | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages acknowledged from this queue since it was created |
| Artemis | artemis_messages_added | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages added to this queue since it was created |
| Artemis | artemis_messages_expired | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages expired from this queue since it was created |
| Artemis | artemis_messages_killed | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of messages removed from this queue since it was created due to exceeding the max delivery attempts |
| Artemis | artemis_number_of_pages | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost | Number of pages used by this address |
| Artemis | artemis_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of all messages (including durable and non-durable) currently in this queue (includes scheduled, paged, and in-delivery messages) |
| Artemis | artemis_replica_sync | gauge | broker: localhost | If the initial replication synchronization process is complete |
| Artemis | artemis_routed_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost | Number of messages routed to one or more bindings |
| Artemis | artemis_scheduled_durable_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of durable scheduled messages in this queue |
| Artemis | artemis_scheduled_durable_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of durable scheduled messages in this queue |
| Artemis | artemis_scheduled_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Number of scheduled messages in this queue |
| Artemis | artemis_scheduled_persistent_size | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost; queue: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request` | Persistent size of scheduled messages in this queue |
| Artemis | artemis_session_count | gauge | broker: localhost | Number of sessions on this server |
| Artemis | artemis_total_connection_count | gauge | broker: localhost | Total number of clients which have connected to this server since it was started |
| Artemis | artemis_total_session_count | gauge | broker: localhost | Total number of sessions created on this server since it was started |
| Artemis | artemis_unrouted_message_count | gauge | address: `*.*.writeattribute.#`, `*.*.writeattributevalue.#`, `provisioning.*.request`; broker: localhost | Number of messages not routed to any bindings |
| Executors | executor_active_threads | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The approximate number of threads that are actively executing tasks |
| Executors | executor_completed_tasks_total | counter | name: ContainerExecutor, ContainerScheduledExecutor | The approximate total number of tasks that have completed execution |
| Executors | executor_idle_seconds | summary | name: ContainerExecutor, ContainerScheduledExecutor | Idle time of executor |
| Executors | executor_idle_seconds_max | gauge | name: ContainerExecutor, ContainerScheduledExecutor | Maximum idle time of executor |
| Executors | executor_pool_core_threads | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The core number of threads for the pool |
| Executors | executor_pool_max_threads | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The maximum allowed number of threads in the pool |
| Executors | executor_pool_size_threads | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The current number of threads in the pool |
| Executors | executor_queue_remaining_tasks | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The number of additional elements that this queue can ideally accept without blocking |
| Executors | executor_queued_tasks | gauge | name: ContainerExecutor, ContainerScheduledExecutor | The approximate number of tasks that are queued for execution |
| Executors | executor_scheduled_once_total | counter | name: ContainerExecutor | Total tasks scheduled once |
| Executors | executor_scheduled_repetitively_total | counter | name: ContainerScheduledExecutor | Total tasks scheduled repetitively |
| Executors | executor_seconds | summary | name: ContainerExecutor, ContainerScheduledExecutor | Measures executor task execution time |
| Executors | executor_seconds_max | gauge | name: ContainerExecutor, ContainerScheduledExecutor | Maximum execution time of executor tasks |
| Events | or_attributes_total | counter | source: AgentService, AttributeLinkingService, EnergyOptimisationService, GatewayService, RulesEngine, none, ... | Total attributes processed by source |
| Events | or_attributes_seconds | summary | (none) | Total time spent processing attribute events |
| Events | or_attributes_seconds_max | gauge | (none) | Maximum time spent processing an attribute event |
| Events | or_provisioning_seconds | summary | (none) | Total time spent processing provisioning requests |
| Events | or_provisioning_seconds_max | gauge | (none) | Maximum time spent processing provisioning requests |
| Rules | or_rules_seconds | summary | (none) | Total time spent processing rules |
| Rules | or_rules_seconds_max | gauge | (none) | Maximum time spent processing rules |
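
The summary-type metrics above report totals, so an average has to be derived. The following sketch assumes the standard Prometheus summary convention, in which a summary named `or_attributes_seconds` is exposed as the series `or_attributes_seconds_count` and `or_attributes_seconds_sum`; the scrape values are made up and `mean_duration` is a hypothetical helper, not an OpenRemote API.

```python
def mean_duration(prev, curr, base="or_attributes_seconds"):
    """Average seconds per event between two scrapes of a summary metric.

    prev and curr map series names to values taken from an earlier and a
    later scrape. Returns None when no new events were recorded or the
    counters went backwards (for example after a restart).
    """
    d_count = curr[f"{base}_count"] - prev[f"{base}_count"]
    d_sum = curr[f"{base}_sum"] - prev[f"{base}_sum"]
    if d_count <= 0:
        return None
    return d_sum / d_count


# Made-up values from two scrapes: 500 new events took 2.5 s in total.
earlier = {"or_attributes_seconds_count": 1000.0, "or_attributes_seconds_sum": 12.0}
later = {"or_attributes_seconds_count": 1500.0, "or_attributes_seconds_sum": 14.5}

print(mean_duration(earlier, later))  # 0.005
```

The same calculation applies to the other summaries in the table, such as `or_rules_seconds` or `executor_seconds`, by passing a different `base`. The matching `_max` gauges complement the average by showing the worst case.
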
PostgreSQL (via Query Exporter)
The following metrics are exposed by the Query Exporter, which connects directly to the OpenRemote PostgreSQL database to monitor TimescaleDB performance, connection limits, and general database health. The list is based on the default configuration found in `/deployment/query-exporter/config.yaml`.
| Metric name | Type | Labels | Description |
|---|---|---|---|
| pg_collation_mismatch_count | gauge | (none) | Number of text indexes with collation version mismatches requiring a REINDEX |
| pg_cache_hit_percentage | gauge | (none) | Percentage of data served from RAM rather than read from disk; higher is better |
| pg_connections_limit | gauge | (none) | Maximum number of connections allowed |
| pg_connections_used | gauge | (none) | Count of connections in use |
| pg_connections_free | gauge | (none) | Count of connections available |
| pg_connections_stuck | gauge | (none) | Count of connections with state of idle in transaction |
| pg_hot_update_percent | gauge | table_name | Percentage of updates to the table that are HOT updates; a high value indicates a good fillfactor |
| pg_dead_tuple_percent | gauge | table_name | Ratio of dead tuples to live tuples in the table, as a percentage; a value above 10-20% suggests autovacuum is not aggressive enough |
| pg_last_autovacuum_hours | gauge | table_name | Hours since the last successful autovacuum run on the table |
| pg_last_autoanalyze_hours | gauge | table_name | Hours since the last successful autoanalyze run on the table |
| pg_db_disk_size | gauge | (none) | DB size in MB |
| pg_datapoint_raw_data_size | gauge | (none) | Asset datapoint table raw uncompressed size in MB |
| pg_datapoint_indexes_size | gauge | (none) | Asset datapoint table indexes size in MB |
| pg_datapoint_toast_size | gauge | (none) | Asset datapoint TOAST table size in MB |
| pg_datapoint_disk_size | gauge | (none) | Asset datapoint table size in MB |
| pg_datapoint_chunk_count | gauge | (none) | Asset datapoint table hypertable chunk count |
| pg_datapoint_uncompressed_chunk_count | gauge | (none) | Asset datapoint table hypertable uncompressed chunk count |
| pg_datapoint_chunks_needing_compression | gauge | (none) | Asset datapoint table hypertable chunks needing compression count |
| pg_datapoint_chunk_start_weeks | gauge | (none) | Asset datapoint table oldest hypertable chunk in weeks |
| pg_datapoint_chunk_end_weeks | gauge | (none) | Asset datapoint table newest hypertable chunk in weeks |
| pg_datapoint_chunks_not_analyzed | gauge | (none) | Asset datapoint table hypertable chunks not yet analyzed |
| pg_datapoint_largest_uncompressed_chunk | gauge | (none) | Asset datapoint table largest uncompressed hypertable chunk in MB |
| pg_datapoint_uncompressed_cache_hit_ratio | gauge | (none) | Asset datapoint table cache hit ratio for uncompressed chunks (Aim for 99%+) |
| pg_datapoint_uncompressed_blks_read_total | counter | (none) | Asset datapoint total physical disk blocks read for uncompressed chunks (monitor the rate; spikes indicate RAM spillover) |
| pg_datapoint_compression_ratio | gauge | (none) | Asset datapoint table compression ratio |
| pg_datapoint_query | gauge | (none) | Dummy metric used to measure typical query time |
| pg_background_errors | counter | (none) | Count of errors in background worker processes |
| pg_timescale_job_total_runs | counter | job_id, proc_name | TimescaleDB job total runs by job |
| pg_timescale_job_total_failures | counter | job_id, proc_name | TimescaleDB job total failures by job |
| pg_timescale_job_last_run_duration_seconds | gauge | job_id, proc_name | TimescaleDB job last run duration in seconds |
| pg_timescale_job_next_start_seconds | gauge | job_id, proc_name | Seconds until next scheduled run for each TimescaleDB job |
| pg_timescale_job_last_run_status | gauge | job_id, proc_name, last_run_status | TimescaleDB job last run status marker |
| pg_wal_total | counter | (none) | Total WAL written since statistics reset in MB |
| pg_bgwriter_checkpoints_timed_total | counter | (none) | Scheduled checkpoints executed |
| pg_bgwriter_checkpoints_req_total | counter | (none) | Requested checkpoints executed |
| pg_bgwriter_checkpoint_write_time_seconds_total | counter | (none) | Total time spent writing checkpoints in seconds |
| pg_bgwriter_checkpoint_sync_time_seconds_total | counter | (none) | Total time spent syncing checkpoints in seconds |
| pg_table_bloat_count | gauge | (none) | Number of tables where dead tuples > 30% of live rows |
| pg_index_bloat_count | gauge | (none) | Number of indexes that are larger than 150% of table size |
| pg_long_running_queries_count | gauge | (none) | Number of queries that have been running for longer than 30 seconds |
| pg_longest_query_duration_seconds | gauge | (none) | Duration of the longest running query in seconds |
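
As a rough illustration of how some of these gauges might be combined, the sketch below derives connection utilisation from `pg_connections_used` and `pg_connections_limit` and lists tables whose `pg_dead_tuple_percent` exceeds a threshold. The helper names, the sample values, the table names, and the 20% default threshold are assumptions chosen for the example (the threshold follows the 10-20% guidance in the table); they are not taken from the Query Exporter configuration.

```python
def connection_utilisation(used, limit):
    """Percentage of the connection limit currently in use."""
    if limit <= 0:
        return 0.0  # avoid dividing by zero if the limit is missing or invalid
    return 100.0 * used / limit


def tables_needing_vacuum(dead_tuple_percent, threshold=20.0):
    """Return table names whose dead tuple percentage exceeds the threshold.

    dead_tuple_percent maps the table_name label to the value of
    pg_dead_tuple_percent for that table.
    """
    return sorted(
        table for table, pct in dead_tuple_percent.items() if pct > threshold
    )


# Made-up sample values for illustration only.
print(connection_utilisation(used=45, limit=100))  # 45.0
print(tables_needing_vacuum({"asset": 3.5, "asset_datapoint": 27.0}))
# ['asset_datapoint']
```

In a real deployment these checks would more likely be written as alert rules in the monitoring system that scrapes the exporter; the Python form is only meant to show the arithmetic.
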