- Leads production support triage efforts, manages bridge line troubleshooting, engages in technical research, and escalates issues to leadership as needed.
- Ensures all impacts are accurately recorded and documented in the system of record, ensures documents and wikis are updated and available for use during triage, and supports the documentation of application flows, upstream/downstream impacts during outages, the customer experience, and contacts for support needs.
location: Plano, Texas
job type: Contract
salary: $36.35 - $46.35 per hour
work hours: 8am to 5pm
education: Bachelor's
responsibilities:
- Identifies and/or validates business impacts through interpretation of monitors, dashboards, and logs to communicate with leadership and vendors
- Manages activities to identify incident root cause, resolution, preventative actions, and change requests, and reports on incident data quality
- Promotes and enforces production governance during triage/testing and identifies production failure scenarios, vulnerabilities, and opportunities for improvement
- Serves as a subject matter expert for applications within a portfolio, leveraging extensive knowledge of application functionalities and application flows
qualifications:
- Assesses and prioritizes research requests, ad hoc reports, and offline incidents at the direction of senior team members and delegates work as needed to team members and peers
- Runs start-of-day application health checks
- Performs traffic routing, takes servers out of tier, takes Java core dumps for root cause analysis, recycles and warms up JVMs, and adds servers back into tier during production issues on an as-needed basis
- Reviews/updates application MRDs (monitoring requirement documents) for proper monitoring
- Provides application release/change support
- Reviews upcoming changes and the change runbooks (detailed task steps)
- Executes the approved changes in production without any errors
- Supports ARC/DR/data center isolation exercises
- Must have excellent oral and written communication skills
skills:
- Identify opportunities for monitoring and automation.
- Develop tools, dashboards, reports, and alerts using tools such as Splunk and Dynatrace to improve monitoring and day-to-day tasks and reduce resource overhead
- Identify stability and risk items in Production, work with various teams to remediate and ensure production environment is stable, available, and resilient
- Bachelor's degree in Computer Science, Information Technology, or related field
- Proven experience in production support or a related role
- Experience in supporting Java/Java web services based applications with high volume transactions
- Working knowledge of Splunk and Dynatrace tools to identify issues in production quickly
- Basic knowledge of Unix/Linux commands to log in to servers, fetch logs, copy/delete files, and run shell scripts
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact HRsupport@randstadusa.com.
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.