Director, Site Reliability Engineering Job at LendingPoint, Texas


Job Description

Job Title: Director, Site Reliability Engineering

Reports To: SVP, QA

FLSA Status: Exempt

Department: Technology

JOB SUMMARY:

Responsible for leading the strategy, architecture, and operations of the Site Reliability Engineering (SRE) function at LendingPoint. This includes overseeing infrastructure automation, DevSecOps, CI/CD pipelines, observability, release management, system stability, and incident response. The Director acts as a high-level technical decision-maker, establishing technical standards, guiding architectural decisions, and ensuring the reliability and scalability of systems to support business goals.

ESSENTIAL JOB FUNCTIONS:

•Provide day-to-day leadership to the SRE team, ensuring effective operations, growth, and innovation.

•Manage cloud-native infrastructure, including servers, container clusters, databases, and networks across AWS/GCP/Azure.

•Design and scale CI/CD pipelines and observability tools (Grafana, Prometheus, Dynatrace, Full Story, etc.) for production-grade environments.

•Oversee release planning, coordination, risk mitigation, and change control across engineering and business stakeholders.

•Implement proactive monitoring, alerting, and incident response systems to ensure performance and reliability.

•Lead capacity planning and scaling efforts for high-growth environments and services.

•Drive automation initiatives to optimize operations, reduce manual effort, and improve service quality.

•Manage vendor relationships with cloud providers, data centers, and infrastructure partners to uphold SLAs and resolve issues efficiently.

•Own disaster recovery and business continuity strategies to minimize downtime and ensure data resilience.

•Develop and maintain infrastructure and operational documentation; provide internal training as needed.

•Guide cross-functional release planning across Product, QA, Engineering, and IT Ops to align with business goals.

•Lead retrospectives for major incidents and continuously improve recovery time and system reliability.

•Promote a culture of continuous improvement, learning, and engineering excellence within the team.

MINIMUM QUALIFICATIONS:

•Bachelor's degree in computer science or a related discipline preferred.

•10+ years of experience in SRE or DevOps roles supporting high-scale systems.

•5+ years of experience leading SRE/DevOps or release teams.

•Strong expertise in Kubernetes administration, Docker container orchestration, and infrastructure as code (IaC).

•Experience managing production infrastructure on AWS, Azure, or Google Cloud Platform.

•Deep knowledge of monitoring, logging, and alerting tools such as Prometheus, Dynatrace, Full Story, or Nagios.

•Hands-on experience with CI/CD tools (e.g., GitLab CI, Jenkins), IaC (Terraform), and scripting languages (Python, Bash, Go).

•Strong programming background in Java, with experience building and scaling microservices-based platforms.

•Solid understanding of web/API technologies (REST, JSON), observability, and API gateways.

•Experience managing environments across development, QA, staging, and production tiers.

•Proven ability to lead disaster recovery planning, business continuity, and compliance enforcement.

•Certification in relevant areas (e.g., AWS, Azure Administrator, GCP Network Engineer) is a plus.

•Excellent analytical, troubleshooting, and decision-making skills for complex system problems.

•Strong verbal and written communication skills, with the ability to interact at all levels of the organization.

COMPETENCIES:

•Customer Service: Exceptional attitude and a passion for providing outstanding service to internal customers.

•Analytical Skills: Proven capacity to extract and manipulate large datasets in an efficient manner.

•Communications: Exhibits good listening and comprehension. Expresses ideas and thoughts in verbal and written form. Strong presentation skills.

•Compliance & Risk Awareness: Enforces standards and policies to ensure secure, compliant operations.

•Infrastructure Management: Expert in managing cloud infrastructure, scalability, security, and platform efficiency.

•Observability & Incident Response: Establishes comprehensive monitoring and drives high-quality incident handling.

•Problem Solving: Tackles complex systems issues with data-driven strategies and root cause analysis.

•Release & Change Management: Effectively governs the release lifecycle, balancing speed with stability.

•Strategic Communication: Engages cross-functional teams and leadership with clarity, transparency, and influence.

•Team Leadership: Inspires and manages high-performing engineering teams with a focus on trust, agility, and resilience.

SUPERVISORY RESPONSIBILITY

Yes

PHYSICAL DEMANDS

While performing the duties of this job, the employee is regularly required to stand, walk, reach and sit for a minimum of 8 hours with or without reasonable accommodation. The employee is required to use hands to finger, handle, or feel objects and/or tools. The employee is required to talk or hear with or without reasonable accommodation and must sometimes lift and move up to 10 pounds.

WORK ENVIRONMENT

While performing the duties of this job, the employee is frequently exposed to moderate noise from computers, printers, and other light traffic in an office setting.

This role is in-office. Remote work may be performed from a pre-approved location, as arranged and scheduled by team management and approved by department leadership.

OTHER DUTIES

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change or be supplemented at any time with or without notice.

Equal Opportunity Employer
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.
