Observability Engineer

Fairfax, VA
Full Time
IT
Experienced

DevOps Observability Engineer
Location: 100% Remote (U.S. Only)
Employment Type: Long-term Contract (36+ months)
Authorization: U.S. Citizen Only

About the Role:

We are seeking a highly skilled and dedicated DevOps Observability Engineer to join our evolving team. This is a critical role focused on building, enhancing, and maintaining robust monitoring, logging, and tracing solutions as we transition our infrastructure from on-premise environments to Microsoft Azure. You will be instrumental in ensuring the performance, reliability, and health of our systems, providing deep insights that drive operational excellence and proactive problem-solving.

Key Responsibilities:

  • Design and Implement Observability Solutions: Architect, implement, and manage comprehensive monitoring, logging, and tracing systems for both existing on-premise infrastructure and new Azure cloud environments.
  • Azure Migration Support: Play a key role in the migration to Azure, specifically designing and deploying observability tools and practices within the Azure ecosystem.
  • Tooling Expertise: Utilize and optimize tools such as Dynatrace, ELK Stack (Elasticsearch, Logstash, Kibana), and other relevant platforms to capture and visualize system metrics, logs, and traces.
  • Automated Alerting & Reporting: Develop and configure automated alerts, dashboards, and reports to provide real-time insights into system health, performance bottlenecks, and potential issues.
  • Performance Optimization: Analyze observability data to identify performance degradation, troubleshoot complex incidents, and recommend solutions for system optimization and stability.
  • Scripting & Automation: Write and maintain automation scripts (primarily in Python) for integrating observability tools, automating data collection, and streamlining operational tasks.
  • Incident Response & Root Cause Analysis: Support incident response efforts by providing critical data and analysis, facilitating rapid root cause identification and resolution.
  • Collaboration & Best Practices: Collaborate closely with development, operations, and security teams to embed observability best practices throughout the software development lifecycle.

Qualifications:

  • Experience: 5-7 years of progressive experience in DevOps roles.
  • Dedicated Observability Experience: Minimum of 2 years focused specifically on DevOps observability, implementing and managing monitoring, logging, and tracing solutions.
  • Cloud Proficiency: Strong hands-on experience with Microsoft Azure services, particularly those related to infrastructure, networking, and monitoring.
  • Observability Tools: Expert-level proficiency with Dynatrace and the ELK Stack (Elasticsearch, Logstash, Kibana).
  • Scripting: Strong programming and scripting skills, particularly in Python, for automation and data manipulation.
  • Problem-Solving: Excellent troubleshooting, analytical, and problem-solving abilities.
  • Communication: Strong communication skills, both written and verbal, with the ability to convey complex technical information to diverse audiences.

Nice to Have:

  • Experience with other monitoring tools (e.g., Prometheus, Grafana, Splunk, Datadog).
  • Familiarity with containerization technologies (Docker, Kubernetes).
  • Experience with Infrastructure as Code (Terraform, Azure Resource Manager templates).
  • Background working with hybrid cloud environments (on-premise to cloud migration).