Site Reliability Engineer/Production Support (Hybrid) || Atlanta, GA (Local) id-3023

Hi,
Hope you are doing well.

This is a contract position. Please reply with your updated resume if this job description matches your profile and you are interested.

 

Title: Site Reliability Engineer/Production Support (Hybrid)

Location: Atlanta, GA (Only Local Candidates)

Duration: 6+ Months

 

Job Description:

The Production Support Engineer II is responsible for providing day-to-day support for business-critical systems, ensuring operational stability, and quickly resolving incidents. This role focuses on resolving low- to medium-priority incidents, maintaining system health, and supporting the improvement of production environments through collaboration with senior engineers and cross-functional teams.

What You’ll Do (Responsibilities)

  • Identify, troubleshoot, and resolve low- to medium-priority technical issues with guidance from senior engineers, ensuring minimal disruption to business operations.
  • Support day-to-day monitoring of system performance and use monitoring tools (e.g., Splunk, Dynatrace, CloudWatch) to detect anomalies and take corrective actions.
  • Collaborate with cross-functional teams to resolve technical incidents and escalate higher-complexity issues to senior engineers as needed.
  • Assist in automating routine production support tasks by developing or modifying scripts and tools.
  • Maintain documentation for production issues, troubleshooting steps, and system configurations, contributing to the shared knowledge base.
  • Participate in incident, problem, and change management processes, following ITIL best practices.
  • Perform root cause analysis for recurring issues and assist senior engineers in implementing permanent fixes to improve system stability.
  • Support the implementation of process improvements to enhance system performance and minimize downtime.
  • Assist with mentoring and supporting junior-level engineers, providing guidance as needed.

Other duties, both major and minor, that are not mentioned above may also be performed. Specific activities may change from time to time.

Qualifications:

Necessary Qualifications:

The requirements listed below are representative of the knowledge, skill and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Bachelor’s degree and 4-7 years of experience, or an equivalent combination of education and software engineering training or experience
  • Proficiency in using monitoring tools like Splunk, Dynatrace, or CloudWatch to detect and resolve system performance issues.
  • SRE (Site Reliability Engineer) skills
  • In-depth knowledge of information systems and ability to identify, apply, and implement IT best practices
  • Understanding of key business processes and competitive strategies related to the IT function
  • Ability to plan and manage projects and solve complex problems by applying best practices
  • Ability to provide direction to and mentor less experienced teammates. Ability to interpret and convey complex, difficult, or sensitive information

Preferred Qualifications:

  • 4-8 years of experience in production support, systems administration, or related technical roles.
  • Experience with IT Service Management (ITSM) tools such as ServiceNow, with a solid understanding of incident, problem, and change management processes.
  • Familiarity with supporting Agile teams and processes.
  • Experience with automation tools and scripting for production support tasks.
  • Banking or financial services experience
  • Experience with cloud technologies and practices such as configuration management (e.g., Terraform), CI/CD (e.g., GitLab), and containerization (e.g., Kubernetes)
  • AWS Certified Solutions Architect Associate certification is a plus
