A short while ago, I had the opportunity to speak with Kord Campbell of Loggly (see Are log files the beginning of Big Data? to read that article). Ali Basiri, software engineer at PagerDuty, spent a few moments answering my typical customer profile questions. His answers follow:
Please introduce yourself and your organization
My name is Ali Basiri, Software Engineer at PagerDuty. PagerDuty is a SaaS-based alerting and on-call management system for IT sysadmins and developers. PagerDuty collects alerts from all of a company's IT monitoring tools and alerts the on-duty engineer if there's a problem. Alerts are dispatched via automated phone call, SMS, or email. PagerDuty also schedules on-call duty for your teams and ensures no failures ever go unnoticed by automatically escalating unanswered alerts to another team member.
What are you doing that requires a product like this?
At PagerDuty we manage many servers to provide a reliable service to our customers. It is extremely difficult to look through all the logs from all the servers, especially in chronological order. This is why we need Loggly: it does all the hard work for us.
What products did you consider before you made a decision?
We looked at Splunk and rsyslogd as potential solutions. Both required hosting, managing, and scaling our own servers without providing all the features we needed. It turns out that with Loggly we're able to get a higher quality of service at a lower overall cost.
Why did you select this one?
Loggly is extremely easy to set up, has a hacker-friendly interface, provides us with real-time indexing of our logs, lets us search the logs from anywhere, spares us the maintenance costs associated with running an in-house solution, and most importantly, has pretty graphs. Every time I use its text or graph search I continue to be amazed by how responsive it is.
What tangible benefits have you gotten through the use of this product?
Since we set up Loggly, we've been able to quickly identify the cause of problems and bugs. We also set up Loggly alerts to be notified of specific scenarios that might occur in our app. Before we had Loggly, we had a script that downloaded a day's worth of logs onto a local dev machine, and we trudged through them. There are a couple of problems with that. First, the aggregated logs are not in order, even though the logs from each server are in order. Second, it's a time-consuming process. With Loggly, the logs are now our first point of reference when we encounter problems or bugs. We've also been much quicker to respond to customer inquiries that require looking at the logs.
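For readers curious about the ordering problem Ali describes, here is a minimal Python sketch of one way to interleave per-server logs chronologically. It is not PagerDuty's script or anything Loggly does internally. The log lines, the server names, and the assumption that every line starts with an ISO 8601 timestamp are all hypothetical, chosen only for illustration.

```python
import heapq
from datetime import datetime


def parse_timestamp(line: str) -> datetime:
    # Assumption: each line starts with an ISO 8601 timestamp, then a space.
    return datetime.fromisoformat(line.split(" ", 1)[0])


def merge_logs(*per_server_logs):
    # Each input is already in order, so a k-way merge is enough;
    # there is no need to re-sort the combined logs from scratch.
    return list(heapq.merge(*per_server_logs, key=parse_timestamp))


# Hypothetical sample data: two servers, each in order on its own.
web1 = [
    "2012-03-01T10:00:00 web1 request received",
    "2012-03-01T10:00:05 web1 response sent",
]
web2 = [
    "2012-03-01T10:00:02 web2 alert queued",
    "2012-03-01T10:00:07 web2 alert dispatched",
]

merged = merge_logs(web1, web2)
for line in merged:
    print(line)
```

Simply concatenating the two lists would leave the 10:00:02 entry after the 10:00:05 entry, which is the out-of-order aggregate described above. The merge only works if the clocks on the servers agree closely enough, which is one more thing a do-it-yourself setup has to look after.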
What advice would you offer others?
Set up a centralized logging solution early on in your product. It will save a lot of headaches later on and help you get to the root cause of bugs much quicker.
Thanks, Ali, for taking the time to answer my questions.