Data Scientist II – Performance Optimization Squad
Posted: 20.05.2026
Closing date: 04.07.2026
Job reference: d5401b16196480b9607e15c0cbf75767
Job information
Location
Stockholm, Stockholm County, Sweden
Company
TN Sweden
Client / Employer
Spotify
Listing type
Basic
EU work permit required
No
Job description
The Performance Optimization Squad is a newly formed team in the Core Infrastructure Studio with a mission to establish a core competency in performance engineering and address systemic inefficiencies across Spotify's platform. With 3,200+ microservices, 40,000 VMs at peak, and 500,000 K8s pods, even minor fleet-wide efficiency improvements result in substantial cost savings. We've already identified $8M+ in annual savings from our first initiative alone, and we're just getting started.

This squad works horizontally across the entire stack, but none of that optimization happens without the data to see it. We're looking for a Data Analyst who will build the measurement foundation that drives every decision we make.

What You'll Do

Optimization at this scale is only possible when someone can see the problem clearly. You'll build the data foundation from scratch, designing and owning the datasets, pipelines, and metrics that make performance inefficiencies visible across the platform.

- Design, build, and maintain datasets and data pipelines that surface resource utilization, cost, and performance signals across Spotify's infrastructure
- Define and own metrics for efficiency, latency, and resource utilization, turning raw infrastructure signals into insights that drive prioritization
- Proactively investigate performance data to surface optimization opportunities, not just respond to engineering requests
- Build dashboards and analyses that support decision-making across the squad and partnering platform teams
- Work with engineers and platform teams to define guardrail metrics, validate findings, and measure the real-world impact of optimization efforts
- Translate complex infrastructure data into clear stories for both technical and non-technical audiences

- Own the data foundation: There is no inherited data infrastructure here; you'll design and build it from scratch. What gets measured, and how, is yours to define
- See your impact directly: Every insight you surface translates into cost savings. We measure success in dollars saved and efficiency gained; your work shows up in production
- Breadth at scale: Work across the entire Spotify platform: 3,200+ microservices, 40,000 VMs, 500,000 K8s pods. Few companies offer data problems at this scale
- Greenfield from day one: Help shape the culture, tooling, and data strategy of a brand new squad with strong executive support

Who You Are

- You have experience working with infrastructure, platform, or cloud cost data; Kubernetes metrics, cost attribution, utilization signals, or observability data feel familiar
- Or you're a strong technical analyst with enough grounding in distributed systems and cloud infrastructure to navigate GKE cost data, JVM metrics, and resource utilization signals
- You write clean, efficient SQL and Python, and are comfortable enough to model data and build lightweight pipelines, not just query existing tables
- You're self-directed: at your best when hunting for problems in the data, not waiting to be handed them
- You're comfortable with ambiguity and able to carve your own path in an early-stage, unstructured environment
- You're experienced with data visualization tools (Looker or similar) and know how to make a dashboard tell a story, not just display numbers
- You communicate clearly and confidently with engineers and non-technical stakeholders alike

Where You'll Be

This role is based in Stockholm or London. We offer you the flexibility to work where you work best! There will be some in-person meetings, but the role still allows for flexibility to work from home.
Skills
apply blended learning
apply for research funding
apply research ethics and scientific integrity principles in research activities
build recommender systems
Business Analytics
Business Intelligence
collect ICT data
communicate with a non-scientific audience
Computational Biology
Computer Simulation
conduct research across disciplines
create data models
Data Engineering
data ethics
Data Mining
Data Models
data quality assessment
Data Science
data visualisation software
define data quality criteria
deliver visual presentation of data
demonstrate disciplinary expertise
design database in the cloud
design database scheme
develop data processing applications
develop professional network with researchers and scientists
Digital Curation
disseminate results to the scientific community
draft scientific or academic papers and technical documentation
empirical analysis
establish data processes
evaluate research activities
execute analytical mathematical calculations
Hadoop
handle data samples
Healthcare Analytics
image recognition
implement data quality processes
increase the impact of science on policy and society
information categorisation
Information Extraction
integrate gender dimension in research
integrate ICT data
interact professionally in research and professional environments
interpret current data
LDAP
LINQ
make data-driven decisions
manage data
manage data collection systems
manage findable accessible interoperable and reusable data
manage ICT data architecture
manage ICT data classification
manage intellectual property rights
manage open publications
manage personal professional development
manage research data
Marketing Analytics
mathematical modelling
MDX
mentor individuals
multidisciplinary research
N1QL
normalise data
online analytical processing
operate open source software
perform data cleansing
perform data mining
perform project management
perform scientific research
promote open innovation in research
promote the participation of citizens in scientific and research activities
promote the transfer of knowledge
publish academic research
quantitative analysis
query languages
report analysis results
Research Design
resource description framework query language
Scientific Computing
scientific literature
Social Network Analysis
SPARQL
speak different languages
State Estimation
statistical modeling techniques
Statistics
synthesise information
teach in academic or vocational contexts
think abstractly
Unstructured Data
use data processing techniques
use databases
use spreadsheets software
visual presentation techniques
write scientific publications
XQuery