Compute Lead
Company: Leidos
Location: Annandale
Posted on: September 13, 2023
Job Description:
The 1901 Group at Leidos seeks a Compute Lead to
support the Federal Trade Commission (FTC) Information Management
Services & Support (IMSS) program. The Compute Lead will serve as
the primary POC for all operations, including, but not limited to,
designing, implementing, migrating, and maintaining the on-premises
and cloud computing platforms for FTC, and ensuring the successful
interoperability of infrastructure components. The Compute Lead is
responsible for planning, designing, and troubleshooting server
deployments, configuring multi-server platforms for maximum
performance and availability, implementing disaster recovery
operations, and working with stakeholders to select compute
solutions appropriate for FTC. The Compute Lead will lead systems
management functions and participate in the design and
implementation of server systems, as well as manage a team of
Windows and Unix engineers and administrators to ensure
accomplishment of projects and tasks required to maintain and
advance the infrastructure.
This position will oversee operations
and administration activities related to the performance of
routine, preventive, predictive, scheduled, and unscheduled actions
to maintain system availability by preventing equipment or system
failure or reduced performance, with the goal of increasing
efficiency, reliability, and safety of the facility, IT equipment,
systems, and applications.
This position requires clear
communication with internal team members and across multiple task
areas and clients, as well as external organizations (e.g.,
sub-contractors, vendors, etc.). The position also works to
influence project/team leaders regarding solution design, process
and/or approaches to continuously improve and modernize the
customer's compute infrastructure.
Primary Responsibilities:
- Manage a team of IT professionals responsible for supporting
the computing platforms within the FTC IT infrastructure.
- Manage all physical, virtual, and cloud-based server
environments.
- Ensure achievement of relevant program Service Level Agreements
(SLAs).
- Review and monitor the state of the server environment. React
with urgency to identified issues.
- Ensure sound engineering and testing of all infrastructure
deployments.
- Initiate compute infrastructure enhancements. Recommend changes
to improve overall compute reliability including technology and
process. Identify and support strategic improvements to the compute
environment including the migration of components and services to
the cloud.
- Ensure development and maintenance of documentation describing
the compute environment including, but not limited to, architecture
diagrams, design documents, standard operating procedures (SOPs),
and tracking lists.
- Adhere to the customer's change management process. Ensure all
changes are well documented with test plans, architecture and
design documentation, roll-back plans, and any other documentation
required by the change process.
- Ensure Configuration Management (CM) of all compute
configuration items in the infrastructure.
- Ensure adherence to Standard Operating Procedures (SOPs).
- Ensure use of customer ticket management solution (ServiceNow)
to log and track all activities.
- Perform routine and urgent patching, as well as vulnerability
management of all servers in accordance with FTC security
requirements and defined SLA standards.
- Ensure rapid resolution of all tickets assigned to the team
including, but not limited to, upgrades and troubleshooting as well
as server and application systems configuration.
- Support asset tracking of compute assets and maintain accurate
inventory of the environment.
- Support program reporting requirements.
- Work across teams to identify and resolve complex
problems.
- Serve as a member of the on-call and after-hours support
rotation.
- Support after-hours and weekend maintenance activities as
required.
- Work onsite at the customer site in Washington, D.C., at least two
business days per week, 8 AM - 5 PM Eastern. Up to 50% remote work
is allowable with flexibility to support maintenance activities and
infrequent unplanned events.
- Attend technical and recurring status meetings as
required.
Basic Qualifications:
- BA/BS degree in a technical IT domain including Windows or Unix
environments and 5+ years of prior relevant experience, or
Master's degree with 3+ years of prior relevant experience.
Additional years of experience will be acceptable in lieu of a
degree.
- Must have at least one of the following current, unexpired
certifications: CompTIA Security+, ISC(2) Systems Security
Certified Practitioner (SSCP), GIAC Certified Windows Security
Administrator (GCWN). Higher-level (IAT-3) DoD 8570 certifications
are acceptable.
- Strong communication skills - excellent verbal and written
skills.
- Familiarity with process frameworks such as the ITIL
framework.
- Excellent technical and strong quantitative, analytical, and
conceptual thinking skills.
- Experience with server management in a similarly sized
environment (150+ servers).
- Experience using an enterprise monitoring solution such as
SolarWinds or ScienceLogic SL1 for monitoring, capacity analysis
and troubleshooting.
- Experience with Azure cloud for servers and services.
- Experience with Windows and Unix server architecture, design,
and implementation.
- Experience with best practices for Windows Server 2022
on-premises and in Azure Cloud.
- Familiarity with Microsoft technologies including Active
Directory, SQL, Exchange, and file/print services.
- Experience with data backup/recovery designs and
implementations.
- Ability to work overtime and support off-hours maintenance
windows as required.
- Ability to obtain public trust clearance. Must be a US
Citizen.
- Ability to perform physical work activities including standing,
walking, bending, and squatting (e.g., while deploying or
decommissioning servers) and lifting up to 50 lbs.
Preferred Qualifications:
- Ten+ (10+) years of extensive experience in a server
environment of at least 1,500 users, including management,
administration, and deployment in support of IT infrastructures, and a
deep understanding of server technologies, virtualization, network
protocols, storage solutions, disaster recovery strategies, and
security principles.
- Certifications including:
  - Microsoft Certified Solutions Expert (MCSE): Cloud Platform and
    Infrastructure
  - Microsoft Certified: Azure Solutions Architect
  - Microsoft Azure Administrator Associate Certification
  - Microsoft Azure Security Engineer Associate Certification
  - VMware Certified Professional (VCP)
  - Unix/Red Hat Certified Engineer (RHCE)
  - NetApp Certified Data Management Administrator (NCDA)
  - Amazon Web Services (AWS) Certified Solutions Architect
  - Google Cloud Certified - Professional Cloud Architect
  - Cisco Certified Network Associate (CCNA)
- Familiarity with SAN and NAS storage administration in Windows
and Unix server environments.
Pay Range: $97,500.00 - $150,000.00 - $202,500.00
The Leidos pay range for this job level
is a general guideline only and not a guarantee of compensation or
salary. Additional factors considered in extending an offer include
(but are not limited to) responsibilities of the job, education,
experience, knowledge, skills, and abilities, as well as internal
equity, alignment with market data, applicable bargaining agreement
(if any), or other law.
Keywords: Leidos, Annandale, Compute Lead, Other, Annandale, Virginia