amazee.io Uptime Monitoring Explained

Uptime monitoring is essential because your website often serves as your business's storefront or the hallway of your government office. Now imagine walking up to a store only to find it closed during its posted business hours.

Frustrating, right?

This is where website uptime monitoring plays an important role, especially for e-commerce and transactional sites, where every second of availability counts. An unavailable website is not just about losing potential sales in those moments; it also dents your brand or organization's reliability and customer trust.

What is Uptime Monitoring?

Uptime monitoring checks a website regularly to ensure it is available and functioning correctly. It's like having a dedicated sentinel on guard to alert you and your team when your site experiences downtime.

This vigilance is particularly critical for websites built on Content Management System (CMS) platforms like Drupal, WordPress, Joomla, or TYPO3. These platforms power a significant portion of the web, from simple blogs to complex e-commerce sites and government portals. Hence, their functionality and performance directly impact the success and effectiveness of the businesses and organizations they represent.

Ensuring these sites remain up and accessible is not just beneficial—it's imperative.

Uptime Monitoring Guarantees and the Cost-Availability Trade-off

When it comes to maintaining an online presence, be it a government portal or an e-commerce site, website uptime is one of the metrics that measure the reliability and availability of your site. Apart from availability, you can also measure other metrics like site speed. If you want to instrument the application further, you could even set performance metrics for your users' specific actions, such as account creation or checkout processes. The sky's the limit.

At amazee.io, we root our uptime monitoring guarantees in a deep understanding of the critical role that reliable hosting plays in a website’s success. We also understand the importance of keeping your site accessible to users around the clock, so we offer robust uptime guarantees designed to meet our clients' enterprise-grade expectations. 

Our hosting packages come with availability guarantees, or targets, of 99.9% and 99.95%. These translate to us guaranteeing you have no more than

  • 43m 28s downtime monthly (99.9%)
  • 21m 44s downtime monthly (99.95%)
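As a rough sketch of the arithmetic behind such figures: the downtime budget is the length of the period multiplied by the allowed unavailability. The function names below are illustrative, and the flat 30-day month is an assumption. The exact result depends on the month length you assume, which is why a 30-day month comes out slightly below the figures listed above.

```python
def downtime_budget_seconds(availability_percent: float, period_days: float = 30.0) -> int:
    """Return the allowed downtime in whole seconds for one period.

    period_days=30.0 is an assumption for illustration; a different
    averaging period gives slightly different numbers.
    """
    period_seconds = period_days * 24 * 60 * 60
    allowed_fraction = 1.0 - availability_percent / 100.0
    return round(period_seconds * allowed_fraction)


def format_minutes_seconds(total_seconds: int) -> str:
    """Format a number of seconds as 'Xm Ys'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


for target in (99.9, 99.95):
    budget = downtime_budget_seconds(target)
    print(f"{target}% over 30 days: {format_minutes_seconds(budget)}")
# 99.9% over 30 days: 43m 12s
# 99.95% over 30 days: 21m 36s
```

Halving the allowed unavailability halves the budget, which is a useful way to picture what each additional "nine" costs.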

However, achieving near-perfect uptime has its challenges and costs.

As with most things in technology, increasing system availability often comes with increased complexity and cost. High website availability setups require redundant systems, more sophisticated monitoring tools, and, often, a more substantial investment in infrastructure and human capital. This is the cost-availability trade-off, a consideration that businesses and developers must make when deciding on their hosting and monitoring solutions.

We aim to strike an optimal balance between cost and availability. Our approach is to provide scalable site uptime monitoring solutions that grow with your site's needs. By leveraging container-based technologies, such as Kubernetes, and employing a sophisticated uptime monitoring system that includes custom tooling (based on Blackbox Exporter, a Prometheus exporter) and data visualization tools like Grafana, we can maximize uptime monitoring coverage for the sites on our platform without unnecessary expenditure.

Our site uptime guarantees are not just numbers; they are a commitment to the reliability and success of your online presence.

Uptime Monitoring - The Importance for CMS Platforms

Uptime monitoring for websites built on CMS platforms is not merely a technical routine; it is a fundamental aspect of maintaining the health of these digital assets. Downtime, even if brief, can have disproportionately adverse effects on these sites, affecting everything from user experience to revenue loss and, ultimately, the bottom line of the businesses they represent.

Why Downtime Hits Harder for CMS Platforms

Websites and web applications are the primary means for businesses to engage with their audience, showcase products or services, and conduct transactions. The unavailability of those sites and applications can lead to the following consequences (this list is not exhaustive):

  1. Lost Revenue: For e-commerce sites, unavailability directly translates to lost sales. Customers who cannot access the site may turn to competitors, resulting in both immediate and potential long-term loss of revenue.
  2. Damaged Reputation: Users expect reliability. Frequent or prolonged unavailability can tarnish a brand's reputation, making it harder to retain loyal users and attract new ones.
  3. Unhappy Citizen Stakeholders: Citizens increasingly expect the websites of their cities, towns, states, and government institutions to be always online. These sites are the first place people turn to in an emergency, be it private, environmental, or political.

Uptime Monitoring: How amazee.io does it

At amazee.io, we understand that website uptime is not just a metric — it's a promise to our clients that their websites are accessible and performing optimally at all times. Our approach to site uptime monitoring is built on a foundation of advanced technology and tailored strategies to ensure that every site we host meets the highest availability standards. Here's a closer look at how we achieve this.

The Role of Blackbox Exporter

One critical tool in our uptime monitoring arsenal is the Blackbox Exporter. This powerful instrument can probe HTTP, HTTPS, DNS, TCP, and ICMP endpoints. By employing Blackbox Exporter, we can comprehensively check a website's functionality, including response times, status codes, and similar metrics. Although Blackbox Exporter supports many functions, we primarily use the HTTP/HTTPS probes.
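To make the idea of an HTTP probe concrete, here is a minimal sketch in Python of what such a check records: the status code and how long the request took. This is not Blackbox Exporter itself and not amazee.io's tooling; the names `ProbeResult` and `probe_http` are hypothetical, and the sketch uses only the Python standard library.

```python
import http.client
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional

# Ignore proxy environment variables so the probe talks to the target directly
# (a design choice made for this sketch).
_DIRECT_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


@dataclass
class ProbeResult:
    url: str
    status_code: Optional[int]   # None when no HTTP response was received
    duration_seconds: float
    error: Optional[str] = None  # set when the connection itself failed


def probe_http(url: str, timeout: float = 5.0) -> ProbeResult:
    """Send one GET request and record the status code and elapsed time."""
    started = time.monotonic()
    status: Optional[int] = None
    error: Optional[str] = None
    try:
        with _DIRECT_OPENER.open(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success code (e.g. 404 or 503).
        status = exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # DNS failure, refused connection, timeout, or a malformed response.
        error = str(exc)
    return ProbeResult(url, status, time.monotonic() - started, error)
```

Calling `probe_http("https://example.com/")` returns a `ProbeResult`; a monitoring system would run such a probe on a schedule and store the results as time series.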

Primary vs. Secondary Domain Monitoring

Our friendly monitoring tool visits every primary domain every minute and checks whether we get a status code 200 back. If not, we have a problem. We treat a few additional HTTP status codes as "all okay" to limit the noise in the monitoring.

Every secondary domain gets a visit every 5 minutes.

What’s a primary or secondary domain, you ask? Great question! 

A primary domain is the first route listed in your .lagoon.yml. Every other route is considered a secondary domain. We adopted this strategy because we have customers with tens or hundreds of domains pointing to a single project. If you checked all of those domains every minute, you would end up with a lot of artificial traffic that often dwarfs the real traffic and sometimes even puts additional stress on the infrastructure. Our approach strikes a healthy balance between checking too often and creating blind spots.
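The primary/secondary rule and its effect on probe traffic can be sketched in a few lines of Python. The intervals come from the description above; everything else (the names `Check`, `build_schedule`, `probes_per_day`, `is_up`, and the example domains) is hypothetical. The post does not say which additional status codes count as "all okay", so the sketch leaves that set as a parameter.

```python
from dataclasses import dataclass
from typing import AbstractSet, List, Sequence

PRIMARY_INTERVAL_SECONDS = 60     # primary domain: every minute
SECONDARY_INTERVAL_SECONDS = 300  # secondary domains: every 5 minutes
SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class Check:
    domain: str
    interval_seconds: int


def build_schedule(routes: Sequence[str]) -> List[Check]:
    """Treat the first route as primary and every other route as secondary."""
    return [
        Check(domain, PRIMARY_INTERVAL_SECONDS if position == 0 else SECONDARY_INTERVAL_SECONDS)
        for position, domain in enumerate(routes)
    ]


def probes_per_day(checks: Sequence[Check]) -> int:
    """Count how many probe requests a schedule generates per day."""
    return sum(SECONDS_PER_DAY // check.interval_seconds for check in checks)


def is_up(status_code: int, extra_ok: AbstractSet[int] = frozenset()) -> bool:
    """200 is healthy; extra_ok holds any further codes accepted as 'all okay'."""
    return status_code == 200 or status_code in extra_ok


# A project with 100 routes (hypothetical domain names).
routes = [f"site-{n}.example.com" for n in range(100)]
tiered = build_schedule(routes)
every_minute = [Check(domain, PRIMARY_INTERVAL_SECONDS) for domain in routes]

print(probes_per_day(tiered))        # 29952
print(probes_per_day(every_minute))  # 144000
```

For this example project, the tiered schedule generates roughly a fifth of the artificial traffic that checking every domain every minute would.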

All of this happens automatically and behind the scenes. Our automation goes so far that when a customer moves away from our hosting services, we detect that the site has moved and automatically stop monitoring it. And, of course, it works the other way around too 😀

The Technical Backbone

Underneath the interfaces of the websites we host, we have a complex infrastructure designed for stability, scalability, and performance. 

Modern infrastructure is built to be very flexible, and our uptime monitoring practices need the same flexibility. This wasn't something we were able to do from the first day. As we gradually moved from traditional to cloud native infrastructure, we also needed to refine our website uptime monitoring methods. This includes regular updates to our monitoring scripts, probes, and alert systems so that we stay ahead of potential issues. The infrastructure itself exceeds the scope of this post, though; today we are primarily looking into uptime monitoring.

Uptime Monitoring: Data Visualization and Alerts

To turn the vast amounts of site uptime monitoring data we collect into actionable insights, we use Grafana for visualization and Alertmanager to alert our engineers if something is wrong.

Grafana allows us to create dashboards that display aggregated information on the availability of the sites our clusters host. This allows us to quickly identify issues, such as whether a single site is affected or if it's a more widespread problem.

Alertmanager ensures that our on-call engineers receive alerts directly. These alerts are routed to our dedicated support team, which is ready to take immediate action 24/7.
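The "single site or widespread problem" question mentioned above can be illustrated with a small Python sketch. This is not how Grafana or Alertmanager are configured at amazee.io; the function name `classify_outage`, the 50% threshold, and the example domains are assumptions chosen only to show the idea of aggregating per-site results.

```python
from typing import Mapping


def classify_outage(site_is_up: Mapping[str, bool], widespread_fraction: float = 0.5) -> str:
    """Summarize per-site probe results for one cluster.

    widespread_fraction is a hypothetical threshold: if at least this share
    of sites is down, the problem is likely in shared infrastructure.
    """
    if not site_is_up:
        return "no-data"
    down = [site for site, up in site_is_up.items() if not up]
    if not down:
        return "healthy"
    if len(down) / len(site_is_up) >= widespread_fraction:
        return "widespread"
    return "isolated"


latest_results = {
    "shop.example.com": True,
    "blog.example.com": False,
    "portal.example.com": True,
    "docs.example.com": True,
}
print(classify_outage(latest_results))  # isolated
```

An "isolated" result points at a single project, while "widespread" suggests looking at the cluster or network first.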

Is amazee.io's Uptime Monitoring Right for You?

amazee.io has designed its availability monitoring system to cater to a broad spectrum of internal needs.

Through comprehensive uptime monitoring strategies, tailored alert systems, and advanced technologies, amazee.io ensures that your digital storefront or portal remains open and your content and functionality remain accessible to your audience around the clock.

However, recognizing that every website has its unique challenges and that every business has its specific goals, the question remains: 

Is amazee.io's uptime monitoring right for you?

amazee.io monitors thousands of domains every minute globally, whereas most customers likely only have a few dozen domains and URLs they care about. In these cases, tools such as OhDear or UptimeRobot are easy to set up to monitor your site. They will provide all the information you need to monitor your web application based on your requirements. With some work in your project's .lagoon.yml, you can even have your deployments update your monitoring automatically by talking to your uptime monitoring provider's API.
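One way to think about keeping an external monitoring provider in sync with your deployments is as a simple set comparison: which domains should be monitored, and which monitors already exist. The sketch below shows only that planning step in Python. The function name `plan_monitor_sync` and the domains are hypothetical, and the actual API calls are left out because they depend entirely on the provider you choose.

```python
from typing import Iterable, List, Tuple


def plan_monitor_sync(
    desired_domains: Iterable[str],
    existing_monitors: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (domains to start monitoring, monitors to remove)."""
    desired = set(desired_domains)
    existing = set(existing_monitors)
    to_create = sorted(desired - existing)
    to_delete = sorted(existing - desired)
    return to_create, to_delete


# Hypothetical example: one domain was added and one retired since the last deployment.
to_create, to_delete = plan_monitor_sync(
    desired_domains=["www.example.com", "shop.example.com"],
    existing_monitors=["www.example.com", "old.example.com"],
)
print(to_create)  # ['shop.example.com']
print(to_delete)  # ['old.example.com']
```

A deployment step would then create and delete monitors through the provider's API according to this plan.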

Ultimately, setting up website uptime monitoring according to your specific use case is crucial. While amazee.io's uptime monitoring is built to accommodate many scenarios, from high-traffic e-commerce platforms to content-rich educational sites, the ultimate decision hinges on a clear understanding of your operational requirements and growth aspirations.

More importantly, we recommend you don't over-monitor your site. If a few minutes of downtime are not mission-critical for your business, we don't recommend spending more time than necessary on accounting for every minute of downtime. On the other hand, if you lose money whenever your web presence is unavailable, this might be a different story.

amazee.io is not just a hosting solution but a strategic digital enabler. We empower enterprises to sculpt their digital infrastructure according to their unique visions and operational needs, including complex challenges such as website uptime and availability monitoring. By working with our team of experts, you enjoy the benefits of 24/7 infrastructure without the additional headcount.

Are you interested in learning more about leveraging the amazee.io platform's fully monitored solution? Get in touch with us today!
