123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285 |
- How to monitor Synapse metrics using Prometheus
- ===============================================
- 1. Install Prometheus:
- Follow instructions at http://prometheus.io/docs/introduction/install/
- 2. Enable Synapse metrics:
- There are two methods of enabling metrics in Synapse.
- The first serves the metrics as a part of the usual web server and can be
- enabled by adding the "metrics" resource to the existing listener as such::
- resources:
- - names:
- - client
- - metrics
- This provides a simple way of adding metrics to your Synapse installation,
- and serves under ``/_synapse/metrics``. If you do not wish your metrics be
- publicly exposed, you will need to either filter it out at your load
- balancer, or use the second method.
- The second method runs the metrics server on a different port, in a
- different thread to Synapse. This can make it more resilient to heavy load
- meaning metrics cannot be retrieved, and can be exposed to just internal
- networks easier. The served metrics are available over HTTP only, and will
- be available at ``/``.
- Add a new listener to homeserver.yaml::
- listeners:
- - type: metrics
- port: 9000
- bind_addresses:
- - '0.0.0.0'
- For both options, you will need to ensure that ``enable_metrics`` is set to
- ``True``.
- Restart Synapse.
- 3. Add a Prometheus target for Synapse.
- It needs to set the ``metrics_path`` to a non-default value (under ``scrape_configs``)::
- - job_name: "synapse"
- metrics_path: "/_synapse/metrics"
- static_configs:
- - targets: ["my.server.here:port"]
- where ``my.server.here`` is the IP address of Synapse, and ``port`` is the listener port
- configured with the ``metrics`` resource.
- If your prometheus is older than 1.5.2, you will need to replace
- ``static_configs`` in the above with ``target_groups``.
- Restart Prometheus.
- Renaming of metrics & deprecation of old names in 1.2
- -----------------------------------------------------
- Synapse 1.2 updates the Prometheus metrics to match the naming convention of the
- upstream ``prometheus_client``. The old names are considered deprecated and will
- be removed in a future version of Synapse.
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | New Name | Old Name |
- +=============================================================================+=======================================================================+
- | python_gc_objects_collected_total | python_gc_objects_collected |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | python_gc_objects_uncollectable_total | python_gc_objects_uncollectable |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | python_gc_collections_total | python_gc_collections |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | process_cpu_seconds_total | process_cpu_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_client_sent_transactions_total | synapse_federation_client_sent_transactions |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_client_events_processed_total | synapse_federation_client_events_processed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_event_processing_loop_count_total | synapse_event_processing_loop_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_event_processing_loop_room_count_total | synapse_event_processing_loop_room_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_count_total | synapse_util_metrics_block_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_time_seconds_total | synapse_util_metrics_block_time_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_ru_utime_seconds_total | synapse_util_metrics_block_ru_utime_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_ru_stime_seconds_total | synapse_util_metrics_block_ru_stime_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_db_txn_count_total | synapse_util_metrics_block_db_txn_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_db_txn_duration_seconds_total | synapse_util_metrics_block_db_txn_duration_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_util_metrics_block_db_sched_duration_seconds_total | synapse_util_metrics_block_db_sched_duration_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_start_count_total | synapse_background_process_start_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_ru_utime_seconds_total | synapse_background_process_ru_utime_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_ru_stime_seconds_total | synapse_background_process_ru_stime_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_db_txn_count_total | synapse_background_process_db_txn_count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_db_txn_duration_seconds_total | synapse_background_process_db_txn_duration_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_background_process_db_sched_duration_seconds_total | synapse_background_process_db_sched_duration_seconds |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_storage_events_persisted_events_total | synapse_storage_events_persisted_events |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_storage_events_persisted_events_sep_total | synapse_storage_events_persisted_events_sep |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_storage_events_state_delta_total | synapse_storage_events_state_delta |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_storage_events_state_delta_single_event_total | synapse_storage_events_state_delta_single_event |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_storage_events_state_delta_reuse_delta_total | synapse_storage_events_state_delta_reuse_delta |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_server_received_pdus_total | synapse_federation_server_received_pdus |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_server_received_edus_total | synapse_federation_server_received_edus |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_notified_presence_total | synapse_handler_presence_notified_presence |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_federation_presence_out_total | synapse_handler_presence_federation_presence_out |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_presence_updates_total | synapse_handler_presence_presence_updates |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_timers_fired_total | synapse_handler_presence_timers_fired |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_federation_presence_total | synapse_handler_presence_federation_presence |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handler_presence_bump_active_time_total | synapse_handler_presence_bump_active_time |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_client_sent_edus_total | synapse_federation_client_sent_edus |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_client_sent_pdu_destinations_count_total | synapse_federation_client_sent_pdu_destinations:count |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_federation_client_sent_pdu_destinations_total | synapse_federation_client_sent_pdu_destinations:total |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_handlers_appservice_events_processed_total | synapse_handlers_appservice_events_processed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_notifier_notified_events_total | synapse_notifier_notified_events |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter_total | synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter_total | synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_http_httppusher_http_pushes_processed_total | synapse_http_httppusher_http_pushes_processed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_http_httppusher_http_pushes_failed_total | synapse_http_httppusher_http_pushes_failed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_http_httppusher_badge_updates_processed_total | synapse_http_httppusher_badge_updates_processed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- | synapse_http_httppusher_badge_updates_failed_total | synapse_http_httppusher_badge_updates_failed |
- +-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
- Removal of deprecated metrics & time based counters becoming histograms in 0.31.0
- ---------------------------------------------------------------------------------
- The duplicated metrics deprecated in Synapse 0.27.0 have been removed.
- All time duration-based metrics have been changed to be seconds. This affects:
- +----------------------------------+
- | msec -> sec metrics |
- +==================================+
- | python_gc_time |
- +----------------------------------+
- | python_twisted_reactor_tick_time |
- +----------------------------------+
- | synapse_storage_query_time |
- +----------------------------------+
- | synapse_storage_schedule_time |
- +----------------------------------+
- | synapse_storage_transaction_time |
- +----------------------------------+
- Several metrics have been changed to be histograms, which sort entries into
- buckets and allow better analysis. The following metrics are now histograms:
- +-------------------------------------------+
- | Altered metrics |
- +===========================================+
- | python_gc_time |
- +-------------------------------------------+
- | python_twisted_reactor_pending_calls |
- +-------------------------------------------+
- | python_twisted_reactor_tick_time |
- +-------------------------------------------+
- | synapse_http_server_response_time_seconds |
- +-------------------------------------------+
- | synapse_storage_query_time |
- +-------------------------------------------+
- | synapse_storage_schedule_time |
- +-------------------------------------------+
- | synapse_storage_transaction_time |
- +-------------------------------------------+
- Block and response metrics renamed for 0.27.0
- ---------------------------------------------
- Synapse 0.27.0 begins the process of rationalising the duplicate ``*:count``
- metrics reported for the resource tracking for code blocks and HTTP requests.
- At the same time, the corresponding ``*:total`` metrics are being renamed, as
- the ``:total`` suffix no longer makes sense in the absence of a corresponding
- ``:count`` metric.
- To enable a graceful migration path, this release just adds new names for the
- metrics being renamed. A future release will remove the old ones.
- The following table shows the new metrics, and the old metrics which they are
- replacing.
- ==================================================== ===================================================
- New name Old name
- ==================================================== ===================================================
- synapse_util_metrics_block_count synapse_util_metrics_block_timer:count
- synapse_util_metrics_block_count synapse_util_metrics_block_ru_utime:count
- synapse_util_metrics_block_count synapse_util_metrics_block_ru_stime:count
- synapse_util_metrics_block_count synapse_util_metrics_block_db_txn_count:count
- synapse_util_metrics_block_count synapse_util_metrics_block_db_txn_duration:count
- synapse_util_metrics_block_time_seconds synapse_util_metrics_block_timer:total
- synapse_util_metrics_block_ru_utime_seconds synapse_util_metrics_block_ru_utime:total
- synapse_util_metrics_block_ru_stime_seconds synapse_util_metrics_block_ru_stime:total
- synapse_util_metrics_block_db_txn_count synapse_util_metrics_block_db_txn_count:total
- synapse_util_metrics_block_db_txn_duration_seconds synapse_util_metrics_block_db_txn_duration:total
- synapse_http_server_response_count synapse_http_server_requests
- synapse_http_server_response_count synapse_http_server_response_time:count
- synapse_http_server_response_count synapse_http_server_response_ru_utime:count
- synapse_http_server_response_count synapse_http_server_response_ru_stime:count
- synapse_http_server_response_count synapse_http_server_response_db_txn_count:count
- synapse_http_server_response_count synapse_http_server_response_db_txn_duration:count
- synapse_http_server_response_time_seconds synapse_http_server_response_time:total
- synapse_http_server_response_ru_utime_seconds synapse_http_server_response_ru_utime:total
- synapse_http_server_response_ru_stime_seconds synapse_http_server_response_ru_stime:total
- synapse_http_server_response_db_txn_count synapse_http_server_response_db_txn_count:total
- synapse_http_server_response_db_txn_duration_seconds synapse_http_server_response_db_txn_duration:total
- ==================================================== ===================================================
- Standard Metric Names
- ---------------------
- As of synapse version 0.18.2, the format of the process-wide metrics has been
- changed to fit prometheus standard naming conventions. Additionally the units
- have been changed to seconds, from miliseconds.
- ================================== =============================
- New name Old name
- ================================== =============================
- process_cpu_user_seconds_total process_resource_utime / 1000
- process_cpu_system_seconds_total process_resource_stime / 1000
- process_open_fds (no 'type' label) process_fds
- ================================== =============================
- The python-specific counts of garbage collector performance have been renamed.
- =========================== ======================
- New name Old name
- =========================== ======================
- python_gc_time reactor_gc_time
- python_gc_unreachable_total reactor_gc_unreachable
- python_gc_counts reactor_gc_counts
- =========================== ======================
- The twisted-specific reactor metrics have been renamed.
- ==================================== =====================
- New name Old name
- ==================================== =====================
- python_twisted_reactor_pending_calls reactor_pending_calls
- python_twisted_reactor_tick_time reactor_tick_time
- ==================================== =====================
|