Image reference: https://mp.weixin.qq.com/s/nAF3lv-qZprLWvOdvSbYXg
Observability refers to the extent to which a system's internal states can be inferred from its external outputs. The term comes from control theory, where observability and controllability are mathematical duals.
In modern software systems and cloud computing, observability plays an increasingly important role in ensuring the reliability, performance, and security of applications and infrastructure. As software systems become more complex, with widespread adoption of microservices and increasing reliance on distributed architectures, the importance of observability becomes more pronounced.
Observability mainly builds on three kinds of data: metrics, logs, and traces.
Observability tools help system administrators and developers collect and analyze the above data, thus improving understanding and control of the system.
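As a minimal sketch of what collecting such data might look like, the snippet below records one metric, one structured log line, and one trace span for a single request. The `Telemetry` class, the `handle_request` function, and the metric name `http_requests_total` are hypothetical names chosen for illustration; real platforms use dedicated agents and SDKs rather than in-memory lists.

```python
import json
import time
import uuid
from collections import Counter


class Telemetry:
    """Hypothetical in-memory collector for metrics, logs, and trace spans."""

    def __init__(self):
        self.metrics = Counter()  # metric name -> cumulative count
        self.logs = []            # structured log records
        self.spans = []           # finished trace spans

    def count(self, name, value=1):
        self.metrics[name] += value

    def log(self, level, message, **fields):
        record = {"ts": time.time(), "level": level, "message": message}
        record.update(fields)
        self.logs.append(record)

    def span(self, name, trace_id, start, end):
        self.spans.append(
            {"name": name, "trace_id": trace_id, "duration_s": end - start}
        )


def handle_request(telemetry, path):
    """Simulated request handler that emits all three signal types."""
    trace_id = uuid.uuid4().hex  # shared ID that ties the log and the span together
    start = time.perf_counter()
    telemetry.count("http_requests_total")
    telemetry.log("INFO", "request handled", path=path, trace_id=trace_id)
    telemetry.span("handle_request", trace_id, start, time.perf_counter())
    return trace_id


telemetry = Telemetry()
trace_id = handle_request(telemetry, "/orders")
print(json.dumps(telemetry.logs[-1], indent=2))
```

Carrying the same `trace_id` in the log record and the span is what lets an analysis tool later jump from a log line to the request that produced it.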
Applications of observability include:
From the era of monolithic applications to the era of microservices, the dimensions of monitoring data (metrics, logs, traces) have evolved as follows:
In the era of monolithic applications, an application was typically deployed as a single unit on a server. Monitoring data therefore had a single, narrow scope: server CPU, memory, network metrics, and so on.
In the SOA era, applications were split into multiple independent services, each of which could be developed, deployed, and managed independently. The scope of monitoring data grew accordingly, covering the resource usage and performance metrics of each service.
In the era of microservices, applications are split into even finer-grained services, each typically responsible for a specific business function. The scope of monitoring data is broader still, covering the resource usage, performance metrics, and request tracing of each microservice.
The dimensions of metrics, logs, and tracing in different eras are summarized as follows:
Era | Metrics | Logs | Tracing |
---|---|---|---|
Monolithic | Server resource usage, etc. | Application logs | None |
SOA | Service resource usage, etc. | Service logs | Service call links |
Microservices | Microservice resource usage, etc. | Microservice logs | Microservice call links |
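
The "call links" in the Tracing column can be illustrated with a small sketch. It assumes a hypothetical set of services (`gateway`, `orders`, `inventory`, `payments`) and a plain list standing in for a tracing backend; the idea shown is only that every hop reuses the trace ID and records its parent span, so the call chain can be rebuilt afterwards.

```python
import uuid

SPANS = []  # stand-in for a tracing backend (assumption for this sketch)


def start_span(service, trace_id=None, parent_id=None):
    """Record a span; the root span creates the trace ID, children reuse it."""
    span = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "service": service,
    }
    SPANS.append(span)
    return span


def call(service, downstream, trace_id=None, parent_id=None):
    """Simulate `service` handling a request and calling its downstream services."""
    span = start_span(service, trace_id, parent_id)
    for child in downstream.get(service, []):
        call(child, downstream, span["trace_id"], span["span_id"])
    return span


def call_links(spans, trace_id):
    """Rebuild (caller, callee) pairs for one trace from parent/child span IDs."""
    by_id = {s["span_id"]: s for s in spans if s["trace_id"] == trace_id}
    return [
        (by_id[s["parent_id"]]["service"], s["service"])
        for s in by_id.values()
        if s["parent_id"] is not None
    ]


# Hypothetical service topology: gateway -> orders -> inventory, payments
DOWNSTREAM = {"gateway": ["orders"], "orders": ["inventory", "payments"]}
root = call("gateway", DOWNSTREAM)
links = call_links(SPANS, root["trace_id"])
print(links)
# [('gateway', 'orders'), ('orders', 'inventory'), ('orders', 'payments')]
```

In a real deployment the trace ID and parent span ID travel in request headers between processes rather than as function arguments.
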
As application architectures evolve, the dimensions of monitoring data have grown increasingly extensive, posing higher demands on the design and implementation of monitoring systems. Monitoring systems must be capable of collecting, storing, and analyzing monitoring data from various sources and dimensions, providing comprehensive support for application maintenance.
Traditional resource-focused monitoring primarily addresses the operational status of systems, including overall health and performance bottlenecks. Traditional resource monitoring typically uses metrics to measure system status, such as CPU usage, memory usage, network traffic, etc.
Application observability, on the other hand, focuses not only on system status but also on application business logic and data. Application observability typically uses logs, tracing, and other technologies to collect and analyze data produced during application runtime.
Aspect | Traditional Resource Monitoring | Application Observability |
---|---|---|
Focus | System operational status | Application status, business logic, data |
Data Sources | Metrics | Logs, tracing, and system metrics |
Traditional resource monitoring is a part of application observability. Application observability needs to collect and analyze system status metrics, often provided by traditional resource monitoring.
Traditional resource monitoring is typically limited to the system level, such as servers, containers, databases, etc. Application observability can extend to the application level, including business logic, data, etc.
In summary, resource monitoring and application observability are related but distinct concepts. Traditional resource monitoring is a part of application observability, providing a foundation for it. Application observability can extend to the application level, supporting analysis of business logic and data.
System monitoring primarily focuses on the operational status of systems, including overall health and performance bottlenecks. System monitoring typically uses metrics to measure system status, such as CPU usage, memory usage, network traffic, etc.
Application observability, in contrast, focuses not only on system status but also on application business logic and data. Application observability typically uses logs, tracing, and other technologies to collect and analyze data produced during application runtime.
The differences between system monitoring and application observability can be summarized as follows:
Aspect | System Monitoring | Application Observability |
---|---|---|
Analysis Purpose | Fault localization, performance optimization | Fault localization, performance optimization, business logic analysis, data understanding |
Monitoring Metrics | CPU usage, memory usage, load | SLOs, SLIs, time measurements, event measurements, availability |
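For example, SLOs (service level objectives) are the targets an application commits to, and SLIs (service level indicators) are the measurements used to judge whether those targets are met. Time and event measurements help analyze business logic, and availability indicates how reliably the application and its data can be reached.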
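The relationship between an SLI and an SLO can be made concrete with a little arithmetic. The sketch below assumes an availability SLI defined as the share of successful requests and an illustrative SLO target of 99.9%; the request counts and function names are made up for the example.

```python
def availability_sli(total_requests, failed_requests):
    """SLI: fraction of requests that succeeded."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing has failed
    return (total_requests - failed_requests) / total_requests


def error_budget_remaining(sli, slo_target):
    """Fraction of the error budget still unused; negative means the SLO is violated."""
    allowed = 1.0 - slo_target  # failure rate the SLO tolerates
    if allowed <= 0:
        raise ValueError("an SLO target of 100% leaves no error budget")
    used = 1.0 - sli            # failure rate actually observed
    return (allowed - used) / allowed


# Illustrative numbers: 400 failures out of 1,000,000 requests, SLO of 99.9%
sli = availability_sli(total_requests=1_000_000, failed_requests=400)
slo = 0.999
print(f"SLI = {sli:.4%}, meets SLO: {sli >= slo}")
print(f"error budget remaining: {error_budget_remaining(sli, slo):.0%}")
```

Here the measured availability is 99.96%, which meets the 99.9% objective and leaves roughly 60% of the error budget for the measurement window.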
Suggestions for addressing the evolution of monitoring data include:
As application architectures evolve, the methods for storing monitoring data have also changed. In the era of monolithic applications, file storage was sufficient for monitoring data. In the SOA and microservices eras, purpose-built stores such as time-series databases (TSDB) and NoSQL databases are required. Looking ahead, as monitoring data volumes and analytical demands grow, newer database technologies such as graph databases are expected to play an increasingly important role in monitoring data storage.
Storage Method | Data Model | Storage Efficiency | Query Efficiency | Suitable Data Types | Applicable Scenarios | Characteristics and Limitations |
---|---|---|---|---|---|---|
File Storage | Unstructured | Low | Low | All | Simple Data Storage | Complex Data Management, Poor Scalability |
SQLDB | Relational | High | High | Structured | Data Analysis | Poor at Storing Unstructured Data, Limited Horizontal Scaling |
TSDB | Time-Series | High | High | Time-Series Data | Monitoring Metrics | Poor at Storing Unstructured Data, Limited Data Types Supported |
NoSQL | Non-Relational | High | Low to High | All | Diverse Data Storage | Flexible Data Model, Less Efficient Queries than Relational Databases |
Row Database | Row | High | High | Structured | Log Data | Flexible Data Model |
Column Database | Column | High | High | Unstructured | Link Tracing Data | Flexible Data Model |
Graph Database | Graph | High | High | Relational Data | Application Topology | Flexible Data Model |
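
To make the time-series data model in the TSDB row more tangible, here is a minimal in-memory sketch. It assumes a series is identified by a metric name plus a set of labels and holds timestamp-ordered samples; `TinyTSDB` and its methods are hypothetical and do not reflect the storage engine of any particular product.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class TinyTSDB:
    """Hypothetical in-memory time-series store keyed by metric name and labels."""

    def __init__(self):
        # series key -> (timestamps, values), kept as parallel lists
        self._series = defaultdict(lambda: ([], []))

    @staticmethod
    def _key(name, labels):
        return (name, tuple(sorted(labels.items())))

    def append(self, name, labels, ts, value):
        timestamps, values = self._series[self._key(name, labels)]
        if timestamps and ts < timestamps[-1]:
            raise ValueError("samples must be appended in time order")
        timestamps.append(ts)
        values.append(value)

    def query_range(self, name, labels, start, end):
        """Return samples with start <= timestamp <= end using binary search."""
        timestamps, values = self._series.get(self._key(name, labels), ([], []))
        lo = bisect_left(timestamps, start)
        hi = bisect_right(timestamps, end)
        return list(zip(timestamps[lo:hi], values[lo:hi]))


db = TinyTSDB()
for ts, cpu in [(0, 0.20), (15, 0.35), (30, 0.90), (45, 0.40)]:
    db.append("cpu_usage", {"host": "web-1"}, ts, cpu)

print(db.query_range("cpu_usage", {"host": "web-1"}, 10, 30))
# [(15, 0.35), (30, 0.9)]
```

Because samples arrive in time order, range queries reduce to two binary searches, which is one reason this model suits monitoring metrics and is a poor fit for unstructured data.
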
Monitoring System | Metric Data | Log Data | Link Tracing Data |
---|---|---|---|
Nagios | File Storage | File Storage | Not Supported |
Zabbix | SQLDB | SQLDB | Not Supported |
Prometheus | TSDB | Not Supported | Not Supported |
Observability Platform | TSDB | NoSQL | NoSQL/Graph Database |
Column and graph databases have become common choices for observability platforms because of their storage efficiency and scalability. In AI-assisted monitoring scenarios (referred to here as AIGC), vector databases also play an important role.
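One reason graph storage suits application topology is that dependency questions become graph traversals. The sketch below uses a plain adjacency map as a stand-in for a graph database and assumes a hypothetical list of (caller, callee) call links; it answers which services are affected, directly or indirectly, when one service fails.

```python
from collections import defaultdict, deque


def build_topology(call_links):
    """Map each service to the set of services that call it directly."""
    callers = defaultdict(set)
    for caller, callee in call_links:
        callers[callee].add(caller)
    return callers


def impacted_by(callers, failed_service):
    """Breadth-first walk upstream: every direct or indirect caller is impacted."""
    seen, queue = set(), deque([failed_service])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


# Hypothetical call links, e.g. as derived from trace data
CALL_LINKS = [
    ("gateway", "orders"),
    ("orders", "inventory"),
    ("orders", "payments"),
    ("gateway", "search"),
]
callers = build_topology(CALL_LINKS)
print(sorted(impacted_by(callers, "payments")))
# ['gateway', 'orders']
```

A graph database runs the same kind of traversal natively over persisted nodes and edges, so the topology does not have to be rebuilt in application code for every query.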
Combine different software components to build an observability platform tailored to specific needs.
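Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com to request removal.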