Monday, February 10, 2014

New job Sr. Site Reliability Engineer

LinkedIn Following

Group: IT Recruiters
Subject: Sr. Site Reliability Engineer

Kathy Maskery posted a job: Sr. Site Reliability Engineer

"Our direct client located in Mineola, NY is looking to add full-time Site Reliability Engineers (SREs) at multiple levels to their team. These high-level SREs will be responsible for broad production support and application troubleshooting in a Linux and open source web-based software environment.

Site Reliability Engineers (SREs) fill the mission-critical role of ensuring that the client's complex, web-scale systems are healthy, monitored, automated, and designed to scale. The engineer will use a unique combination of software development, networking, and systems administration expertise to tackle challenging situations every day. Responsibilities include engineering reliability into code, infrastructure, OS, network, and processes used to ensure the application is always fast, available, and scalable. This includes delving into how software performs, packets flow, and hardware and code interact, in support of managing services, and predicting and preventing failures.

The successful candidate will also need to effectively guide incident response of cross-functional support teams to troubleshoot and address database, OS, application, network and any other issues. Successful coordination of efforts of other teams within the company will be an essential responsibility. The candidate will be looked upon as an expert and advocate to fellow engineers on making design and reliability trade-offs in running large-scale services and engineering complex systems that fail gracefully and transparently to users. To be successful, the engineer will need a passion for technology, the desire to be in the center of the action, and the ability to routinely tackle complex software and systems issues.

Essential Functions:
* Manage the availability, latency, scalability and efficiency of company services by engineering reliability into software and systems.
* Serve as a primary point responsible for the overall health, performance, and capacity of our internet-facing systems.
* Respond to and resolve emergent service problems; build tools and automation to prevent problem recurrence.
* Participate in software and system performance analysis and tuning, service capacity planning and demand forecasting.
* Perform periodic on-call duty as part of a 24/7/365 team.

Qualifications:
* Bachelor's Degree in Computer Science or equivalent practical experience.
* 5+ years engineering and/or administering a high-volume or critical production service environment running on a UNIX/Linux platform.
* Strong working knowledge of C, C++ or Java and Shell, Perl or Python.
* Hands-on experience in Apache, JBoss, Tomcat, Oracle, Load Balancers (F5) and Firewalls.
* Understanding of IP networking, network devices and common topologies.
* Proven technical troubleshooting and performance tuning experience.
* Excellent analytical skills, coupled with a strong sense of ownership, urgency and drive.
* Ability to troubleshoot and resolve customer problems with a high degree of independence, as well as manage multiple task assignments.
* Excellent written communication skills.

Desired Skills:
* PureData (DB2), Mongo, Redis, Qpid (MRG), Data Power, Mule

Apply here: http://www.aplitrak.com/?adid=a2F0aGxlZW5tLjU3OTAwLjMzNzBAZGVzaWduc3RyYXRlZ3kuYXBsaXRyYWsuY29t"
