What Data Analyst Interviews Assess
Data analyst interviews test technical skills in three areas: SQL, Excel, and visualization tools such as Tableau and Power BI. Companies probe how you extract insights from messy datasets, write efficient queries that join multiple tables, and communicate findings to non-technical stakeholders. You'll face scenario-based challenges that require you to identify trends, calculate business metrics, and recommend actions based on data.
These data analyst interview questions cover SQL fundamentals (joins, aggregations, window functions), Excel techniques (pivot tables, VLOOKUP, conditional formatting), Tableau dashboards and visualization best practices, and business analysis that turns raw data into actionable recommendations. Modern data analyst roles emphasize storytelling with data, business context over pure technical manipulation, and dashboards that drive decisions rather than just display numbers.
SQL Queries & Data Extraction
Q: Explain the difference between INNER JOIN, LEFT JOIN, and FULL OUTER JOIN.
INNER JOIN returns only the rows that match in both tables. LEFT JOIN returns all rows from the left table plus matching rows from the right; where a left row has no match, the right table's columns are NULL. RIGHT JOIN is the mirror image: all rows from the right table plus matches from the left. FULL OUTER JOIN returns all rows from both tables, filling the unmatched side with NULL. Use INNER JOIN when you need only matching records (orders that have customers) and LEFT JOIN when every left-table record must be preserved (all customers, even those without orders).
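The join types above can be compared on a tiny dataset. This is a minimal sketch using Python's built-in sqlite3 module; the customers and orders tables and their contents are invented for illustration. FULL OUTER JOIN is emulated with a LEFT JOIN plus the unmatched right-side rows, on the assumption that the bundled SQLite may predate native FULL OUTER JOIN support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: order 103 references customer 9, who does not exist.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (101, 1, 50.0), (102, 1, 20.0), (103, 9, 75.0);
""")

# INNER JOIN: only customer/order pairs that match on both sides.
inner_rows = conn.execute("""
    SELECT c.name, o.id
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer; the order column is NULL (None) without a match.
left_rows = conn.execute("""
    SELECT c.name, o.id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# FULL OUTER JOIN, emulated: the LEFT JOIN rows plus orders with no customer.
full_rows = conn.execute("""
    SELECT c.name, o.id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    UNION ALL
    SELECT NULL, o.id
    FROM orders AS o
    WHERE NOT EXISTS (SELECT 1 FROM customers AS c WHERE c.id = o.customer_id)
""").fetchall()

print(inner_rows)
print(left_rows)
print(full_rows)
```

Customers without orders appear only in the LEFT and FULL results, and the orphaned order appears only in the FULL result.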
Q: How do you find duplicate records in a table?
Use GROUP BY with HAVING COUNT(*) > 1 to identify duplicated values. For example, SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1 finds email addresses that appear more than once. To see the duplicate rows themselves, use a window function: SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn FROM users) AS ranked WHERE rn > 1. This returns the second and subsequent occurrences of each email, which are the rows you would typically review or delete. The subquery alias (ranked) is required by several databases, including MySQL and SQL Server.
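Both duplicate-finding queries can be run end to end as a quick check. The sketch below uses Python's sqlite3 module with an invented users table; it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: a@example.com appears three times.
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT);
    INSERT INTO users VALUES
        (1, 'a@example.com', '2024-01-01'),
        (2, 'b@example.com', '2024-01-02'),
        (3, 'a@example.com', '2024-01-03'),
        (4, 'a@example.com', '2024-01-04');
""")

# Which values are duplicated, and how many times?
dup_emails = conn.execute("""
    SELECT email, COUNT(*) AS n
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()

# Which rows are the second and later occurrences of each email?
extra_rows = conn.execute("""
    SELECT id, email, rn
    FROM (
        SELECT id, email,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
        FROM users
    ) AS ranked
    WHERE rn > 1
    ORDER BY id
""").fetchall()

print(dup_emails)
print(extra_rows)
```

The first query answers "is there a problem?"; the second identifies the specific rows to act on while leaving the earliest row per email untouched.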
Q: What are window functions and when would you use them?
Window functions perform calculations across a set of rows related to the current row without collapsing the result into one row per group, as GROUP BY does. Common functions include ROW_NUMBER(), RANK(), and DENSE_RANK() for ranking, LAG()/LEAD() for accessing previous/next rows, and SUM() OVER() for running totals. Use them for moving averages, top N per category, or differences between consecutive rows. Example: AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) calculates a 7-day moving average, assuming one row per day; the same frame with SUM() gives a rolling 7-day total.
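A running total, a moving average, and a row-to-row difference can sit side by side in one query. This is a small sketch with Python's sqlite3 module; the daily_sales table is invented, and a 3-row window stands in for the 7-day window so the arithmetic is easy to verify by hand. It assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT PRIMARY KEY, revenue REAL)")
# Hypothetical data: one row per day.
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-01-01", 100.0), ("2024-01-02", 200.0),
     ("2024-01-03", 300.0), ("2024-01-04", 400.0)],
)

rows = conn.execute("""
    SELECT day,
           revenue,
           -- Running total: everything up to and including the current row.
           SUM(revenue) OVER (ORDER BY day) AS running_total,
           -- Moving average over the current row and the two before it.
           AVG(revenue) OVER (
               ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg_3,
           -- Difference from the previous row (NULL on the first row).
           revenue - LAG(revenue) OVER (ORDER BY day) AS change_vs_prev
    FROM daily_sales
    ORDER BY day
""").fetchall()

for row in rows:
    print(row)
```

Every input row is still present in the output, which is the key difference from GROUP BY.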
Q: How do you optimize a slow SQL query?
First, use EXPLAIN (or your database's execution plan viewer) to see how the query runs and where the bottlenecks are. Add indexes on columns used in WHERE clauses, JOIN conditions, and ORDER BY. Avoid SELECT *; retrieve only the columns you need. Rewrite correlated subqueries as JOINs where the plan shows that this helps. For large aggregations, consider materialized views or pre-aggregated tables. Partition large tables by date or category. Keep table statistics up to date so the query optimizer can make good decisions.
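The effect of an index on the execution plan can be observed directly. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the orders table and the index name idx_orders_customer are invented for illustration, and other databases use different EXPLAIN syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
# Hypothetical data: 1,000 orders spread across 100 customers.
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT id, amount FROM orders WHERE customer_id = ?"

def plan(sql, params):
    """Return SQLite's query plan as one string (the last column is the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Without an index, SQLite has to scan the whole table.
plan_before = plan(QUERY, (42,))

# After indexing the filtered column, the plan switches to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies by SQLite version, but the shift from a table scan to an index search is the pattern to look for in any database's plan output.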
💡 Pro tip: Interviewers test SQL understanding through real business scenarios. "Find customers who purchased in Q1 but not Q2" reveals whether you understand self-joins, NOT EXISTS, or EXCEPT clauses versus just memorizing syntax.
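The "Q1 but not Q2" scenario can be answered in more than one way. Here is a hedged sketch of the NOT EXISTS and EXCEPT approaches using Python's sqlite3 module; the purchases table, its columns, and the 2024 date boundaries are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customer 1 buys in Q1 only, customer 2 in Q1 and Q2,
# customer 3 in Q2 only.
conn.executescript("""
    CREATE TABLE purchases (customer_id INTEGER, purchased_on TEXT);
    INSERT INTO purchases VALUES
        (1, '2024-02-10'),
        (2, '2024-03-05'),
        (2, '2024-05-20'),
        (3, '2024-06-01');
""")

# NOT EXISTS: keep Q1 buyers for whom no Q2 purchase can be found.
not_exists_rows = conn.execute("""
    SELECT DISTINCT p.customer_id
    FROM purchases AS p
    WHERE p.purchased_on >= '2024-01-01' AND p.purchased_on < '2024-04-01'
      AND NOT EXISTS (
          SELECT 1
          FROM purchases AS q
          WHERE q.customer_id = p.customer_id
            AND q.purchased_on >= '2024-04-01' AND q.purchased_on < '2024-07-01'
      )
""").fetchall()

# EXCEPT: the set of Q1 buyers minus the set of Q2 buyers.
except_rows = conn.execute("""
    SELECT customer_id FROM purchases
    WHERE purchased_on >= '2024-01-01' AND purchased_on < '2024-04-01'
    EXCEPT
    SELECT customer_id FROM purchases
    WHERE purchased_on >= '2024-04-01' AND purchased_on < '2024-07-01'
""").fetchall()

print(not_exists_rows, except_rows)
```

Both queries return the same customers here; being able to explain why, and when one form might perform better, is what the interviewer is usually after.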
Excel Functions & Analysis
Q: Explain the difference between VLOOKUP and INDEX-MATCH.
VLOOKUP searches down the first column of a range and returns a value from a column to the right. It requires the lookup column to be leftmost, and because the return column is given as a fixed number, it can return the wrong column when columns are inserted. INDEX-MATCH combines two functions for more flexibility: MATCH finds the position and INDEX returns the value at that position. INDEX-MATCH works with lookup columns anywhere, handles left lookups, and doesn't break when columns are added or removed. Newer Excel versions (Microsoft 365 and Excel 2021 onward) also offer XLOOKUP, which combines the benefits of both.
Q: How do you create a dynamic pivot table?
Convert your data range to an Excel Table (Insert > Table) before creating the pivot table. Tables expand automatically when new rows are added, so the pivot table picks up the new data when refreshed. Alternatively, use a named range with a dynamic formula such as =OFFSET($A$1,0,0,COUNTA($A:$A),COUNTA($1:$1)), which assumes there are no blank cells in column A or row 1. For automatic refresh, enable "Refresh data when opening the file" in PivotTable Options, or use VBA or Power Query to refresh when the workbook opens. This keeps dashboards current without manual range updates.
Q: What's the difference between absolute and relative cell references?
Relative references (A1) change when formulas are copied. Copying =A1+B1 down becomes =A2+B2. Absolute references ($A$1) stay fixed when copied. Mixed references like $A1 (fixed column, relative row) or A$1 (relative column, fixed row) lock one dimension. Use absolute references for constants like tax rates or conversion factors that shouldn't change. Use relative references for formulas that should adjust to each row's data.
Q: How do you handle missing or inconsistent data in Excel?
Use Find & Replace to standardize inconsistent text such as spelling variations. Remove duplicates via Data > Remove Duplicates. Wrap lookup formulas in =IFERROR() or =IFNA() so that errors such as #N/A are replaced with a fallback value. Use conditional formatting to highlight blanks or outliers for manual review. Power Query provides robust data cleaning tools, including trimming whitespace, changing case, converting data types (for example, mixed date formats), and replacing values across columns. For numeric gaps, decide between filling with the average, median, or previous value, or flagging the value as missing, based on business context.
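The choice between fill strategies for numeric gaps is easier to discuss with concrete numbers. The sketch below is written in Python rather than Excel, purely to make the trade-off visible; the readings list and the helper names are invented for illustration.

```python
from statistics import median

# Hypothetical column of values; None marks a missing cell.
readings = [12.0, None, 15.0, None, None, 9.0]

def fill_with_median(values):
    """Replace each gap with the median of the known values."""
    known = [v for v in values if v is not None]
    mid = median(known)
    return [mid if v is None else v for v in values]

def fill_forward(values):
    """Replace each gap with the most recent known value (stays None at the start)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

median_filled = fill_with_median(readings)
forward_filled = fill_forward(readings)
# Keeping a flag column preserves the fact that a value was imputed.
missing_flags = [v is None for v in readings]

print(median_filled)
print(forward_filled)
print(missing_flags)
```

Median filling suits values with no natural order, forward filling suits time series where the last reading is the best guess, and the flag column lets a reviewer see which numbers were never actually observed.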
Tableau Dashboards & Visualization
Q: What's the difference between a data extract and a live connection in Tableau?
Live connections query the database in real time, so they show current data, but performance depends on database speed. Extracts are snapshots (.hyper files) stored locally or on Tableau Server, and they are usually faster because the data is optimized for Tableau's engine. Use live connections for real-time dashboards with frequently changing data. Use extracts for large datasets, slow databases, or offline access. Schedule extract refreshes (hourly, daily) to balance freshness with performance.
Q: Explain Level of Detail (LOD) expressions and when to use them.
LOD expressions perform calculations at a different granularity than the view. FIXED calculates at the specified dimensions regardless of the dimensions in the view; it is evaluated before ordinary dimension filters, so it ignores them, but it still respects context, data source, and extract filters. INCLUDE calculates at a finer level by adding dimensions to those in the view. EXCLUDE calculates at a coarser level by removing dimensions. Example: {FIXED [Customer]: SUM([Sales])} calculates total sales per customer regardless of an ordinary date filter in the view. Use LOD expressions when you need calculations independent of visualization granularity, like comparing individual values to category averages or calculating cohort metrics.
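For readers who think in SQL, the FIXED example has a rough analogue: compute the per-customer total on the unfiltered table first, then join it onto the filtered rows. The sketch below uses Python's sqlite3 module with an invented sales table. It is an analogy under stated assumptions, not a description of how Tableau executes LOD expressions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Ana has one order before March and one in March.
conn.executescript("""
    CREATE TABLE sales (customer TEXT, order_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('Ana', '2024-01-15', 100.0),
        ('Ana', '2024-03-02', 50.0),
        ('Ben', '2024-03-10', 70.0);
""")

rows = conn.execute("""
    WITH customer_totals AS (
        -- Like {FIXED [Customer]: SUM([Sales])}: computed before the date filter.
        SELECT customer, SUM(amount) AS total_sales
        FROM sales
        GROUP BY customer
    )
    SELECT s.customer, s.order_date, s.amount, t.total_sales
    FROM sales AS s
    JOIN customer_totals AS t ON t.customer = s.customer
    -- Like a date filter on the view: it limits rows, not the totals.
    WHERE s.order_date >= '2024-03-01'
    ORDER BY s.customer
""").fetchall()

for row in rows:
    print(row)
```

Ana's March row still carries her full total of 150.0, even though the January order is filtered out of the result.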
Q: How do you optimize dashboard performance in Tableau?
Use extracts instead of live connections for large datasets. Filter at the data source level to limit the data before it reaches Tableau. Avoid complex calculations in views; pre-calculate them in the data source when possible. Prefer dashboard actions over quick filters, which can trigger extra queries to populate their value lists. Hide unused fields and remove unnecessary sheets. Avoid nested LOD expressions and row-level calculations on large datasets. Use Tableau's Performance Recording feature to find the worksheets and queries that cause slowness.
Business Insights & Communication
Q: How do you approach analyzing a new dataset?
Start with exploratory data analysis (EDA): check data types, distributions, missing values, and outliers. Calculate basic statistics (mean, median, range) for numeric columns and frequency counts for categorical data. Visualize distributions with histograms and relationships with scatter plots. Identify data quality issues early including duplicates, inconsistent formats, or logical errors. Understand business context by asking stakeholders what decisions this data informs. Document assumptions and data limitations before detailed analysis.
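A first pass over a new dataset can be as simple as a handful of summary numbers. This is a minimal sketch in plain Python, with an invented records list, showing how basic statistics and frequency counts surface the kinds of issues described above.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample with a missing value, an inconsistent label, and an outlier.
records = [
    {"region": "North", "order_value": 120.0},
    {"region": "South", "order_value": 80.0},
    {"region": "North", "order_value": None},
    {"region": "north", "order_value": 95.0},
    {"region": "South", "order_value": 2500.0},
]

values = [r["order_value"] for r in records if r["order_value"] is not None]

summary = {
    "rows": len(records),
    "missing_order_value": sum(r["order_value"] is None for r in records),
    "mean": mean(values),
    "median": median(values),
    "min": min(values),
    "max": max(values),
}

# Frequency counts for the categorical column expose inconsistent labels.
region_counts = Counter(r["region"] for r in records)

print(summary)
print(region_counts)
```

The large gap between the mean (698.75) and the median (107.5) points to an outlier worth investigating, and the separate "north" count flags a formatting inconsistency to fix before grouping by region.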
Q: Explain how you would calculate month-over-month growth rate.
Month-over-month growth compares the current month to the previous month. Formula: (Current Month - Previous Month) / Previous Month * 100. In SQL, use the LAG window function: SELECT month, revenue, (revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) AS mom_growth FROM sales. Multiplying by 100.0 avoids integer division, and NULLIF returns NULL instead of a division-by-zero error when the previous month has zero revenue. The first month has no previous row, so its growth is NULL. To account for seasonality, also calculate year-over-year growth by comparing the same month across years.
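The month-over-month calculation, including its edge cases, can be checked on a few rows. The sketch below uses Python's sqlite3 module; the sales table and its figures are invented, with a zero-revenue first month included on purpose. It assumes SQLite 3.25 or later for LAG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT PRIMARY KEY, revenue REAL)")
# Hypothetical data: January has zero revenue to exercise the edge case.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01", 0.0), ("2024-02", 200.0), ("2024-03", 250.0), ("2024-04", 200.0)],
)

rows = conn.execute("""
    SELECT month,
           revenue,
           -- NULLIF turns a zero denominator into NULL, so the result is NULL
           -- rather than an error or a misleading number.
           (revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0
               / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) AS mom_growth_pct
    FROM sales
    ORDER BY month
""").fetchall()

growth = {month: pct for month, _, pct in rows}
print(growth)
```

January has no previous month and February's previous month is zero, so both come back as NULL (None in Python); March shows 25.0 and April shows -20.0.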
Q: How do you communicate findings to non-technical stakeholders?
Lead with business impact, not technical methods. Start with the recommendation or key insight, then support it with data. Use simple visualizations and avoid complex charts that non-technical audiences struggle to interpret. Avoid jargon: instead of "regression analysis," say "we found a relationship between X and Y." Provide context by comparing to targets, previous periods, or industry benchmarks. Anticipate questions about methodology, but keep technical details brief unless asked. Focus on actionable next steps rather than just reporting numbers.
Q: What metrics would you track for an e-commerce business?
Revenue metrics: total revenue, average order value (AOV), revenue per customer. Customer metrics: customer acquisition cost (CAC), customer lifetime value (CLV), retention rate, churn rate. Conversion metrics: conversion rate, cart abandonment rate, checkout completion rate. Traffic metrics: website visitors, traffic sources, bounce rate. Product metrics: best sellers, inventory turnover, product return rate. Profitability: gross margin, contribution margin by product. Choose metrics that match the business goal, for example growth versus profitability.
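A few of these metrics reduce to simple ratios, and interviewers often ask you to compute them on the spot. The sketch below uses invented period totals and one common set of definitions; companies differ on details such as whether conversion rate is measured per session or per visitor, so treat the formulas as assumptions to confirm.

```python
def ecommerce_kpis(revenue, orders, sessions, carts_created):
    """Compute a few common ratio metrics from period totals (hypothetical helper)."""
    return {
        # Average order value: revenue per completed order.
        "average_order_value": revenue / orders,
        # Conversion rate: share of sessions that end in an order.
        "conversion_rate_pct": orders * 100 / sessions,
        # Cart abandonment: share of carts that never become an order.
        "cart_abandonment_rate_pct": (carts_created - orders) * 100 / carts_created,
    }

# Hypothetical monthly totals.
kpis = ecommerce_kpis(
    revenue=50_000.0, orders=1_000, sessions=40_000, carts_created=4_000
)
print(kpis)
```

With these totals, the average order value is 50.0, the conversion rate is 2.5%, and the cart abandonment rate is 75.0%.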
⚠️ Common mistake: Creating dashboards without understanding who uses them and for what decisions. I've seen analysts build complex visualizations nobody looks at because they didn't validate requirements with actual users first.
SQL & Analytics Challenges
20 Practice Questions
1. Which JOIN returns all rows from both tables?
- INNER JOIN
- LEFT JOIN
- FULL OUTER JOIN
- CROSS JOIN
2. In Excel, what does $ indicate in a cell reference?
- Currency formatting
- Absolute reference (fixed when copied)
- Formula error
- Named range
3. Which approach finds duplicates in SQL?
- DISTINCT
- GROUP BY with HAVING COUNT(*) > 1
- UNIQUE
- DEDUPLICATE
4. What does VLOOKUP's FALSE parameter do?
- Exact match only
- Approximate match
- Case-insensitive search
- Returns error if not found
5. In Tableau, what's faster for large datasets?
- Live connection
- Data extract (.hyper file)
- Both same speed
- Depends on visualization type
6. What calculates running totals in SQL?
- GROUP BY
- HAVING
- SUM() OVER (ORDER BY date)
- AGGREGATE
7. Which Excel function combines benefits of VLOOKUP and INDEX-MATCH?
- HLOOKUP
- XLOOKUP
- LOOKUP
- MATCH
8. What does FIXED LOD expression do in Tableau?
- Fixes errors in data
- Calculates at specified dimensions ignoring filters
- Locks visualization from editing
- Creates static dashboard
9. Formula for month-over-month growth rate?
- Current / Previous
- (Current - Previous) / Previous * 100
- Current - Previous
- (Current + Previous) / 2
10. Which handles missing values in Excel formulas?
- ISERROR
- IFERROR or IFNA
- ISBLANK
- ERROR.TYPE
11. What's the difference between dimensions and measures in Tableau?
- Dimensions are categorical, measures are numeric
- Both are the same
- Dimensions are calculated fields
- Measures are filters only
12. Which SQL clause filters aggregated results?
- WHERE
- HAVING
- GROUP BY
- ORDER BY
13. In pivot tables, what happens when you refresh?
- Creates new pivot table
- Updates with latest data from source
- Resets all formatting
- Deletes filters
14. What identifies bottlenecks in SQL queries?
- SELECT statement
- EXPLAIN or execution plan
- DESCRIBE
- SHOW TABLES
15. Which metric is Customer Lifetime Value (CLV) most often compared against?
- Customer Acquisition Cost (CAC)
- Average Order Value
- Churn Rate
- Conversion Rate
16. Which Excel feature automatically expands when adding data?
- Named Range
- Table (Insert > Table)
- Cell Reference
- Formula
17. What does ROW_NUMBER() do in SQL?
- Counts total rows
- Assigns sequential number to each row in partition
- Returns row index
- Filters rows
18. In Tableau, what improves dashboard performance?
- Add more worksheets
- Filter at data source level
- Use only live connections
- Increase refresh rate
19. INDEX-MATCH advantages over VLOOKUP?
- Faster performance
- Lookup column can be anywhere, handles inserts
- Easier syntax
- Works with more data types
20. What calculates difference between consecutive rows in SQL?
- DIFF function
- Current value - LAG(value) OVER (ORDER BY ...)
- DELTA
- PREVIOUS
FAQ
Do I need to know both Tableau and Power BI?
Learn one deeply rather than both superficially. Tableau and Power BI share core concepts such as dashboards, filters, and calculated fields, so mastering one makes learning the other easier. Check job postings in your target market to see which is more common. Tableau is often found in tech and finance, while Power BI is common in companies built around the Microsoft ecosystem.
How advanced should my SQL skills be?
Master joins, aggregations, subqueries, and window functions at minimum. Know when to use HAVING versus WHERE, how to optimize queries with indexes, and how to handle NULL values. Practice writing queries for business scenarios like calculating retention rates, finding top customers, or analyzing trends. Advanced topics like query optimization and execution plans help for senior roles.
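The WHERE versus HAVING distinction and NULL handling are easy to get wrong under pressure, so a compact example helps. This sketch uses Python's sqlite3 module with an invented orders table; the 100 threshold is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Cy's only order has a NULL amount.
conn.executescript("""
    CREATE TABLE orders (customer TEXT, status TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('Ana', 'paid', 120.0),
        ('Ana', 'paid', 90.0),
        ('Ana', 'refunded', 500.0),
        ('Ben', 'paid', 60.0),
        ('Cy', 'paid', NULL);
""")

top_customers = conn.execute("""
    SELECT customer, SUM(amount) AS paid_total
    FROM orders
    WHERE status = 'paid'        -- filters individual rows before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- filters groups after aggregation
    ORDER BY paid_total DESC
""").fetchall()

# COUNT(*) counts rows; COUNT(amount) skips rows where amount is NULL.
row_count, amount_count = conn.execute(
    "SELECT COUNT(*), COUNT(amount) FROM orders"
).fetchone()

print(top_customers)
print(row_count, amount_count)
```

Ana's refunded order is removed by WHERE before the sum is taken, Ben's total falls below the HAVING threshold, and Cy's NULL total fails the comparison, so only Ana is returned.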
Is Excel still important for data analysts?
Yes, Excel remains essential for quick analysis, ad-hoc reporting, and stakeholder communication. Many business users prefer Excel for its familiarity. Master pivot tables, VLOOKUP/INDEX-MATCH, conditional formatting, and basic formulas. Power Query adds data transformation capabilities that cover many everyday ETL tasks. Excel complements SQL and visualization tools rather than replacing them.
What statistics knowledge do data analysts need?
Understand descriptive statistics (mean, median, mode, standard deviation), correlation versus causation, and basic probability. Know how to tell a statistically significant change from random variation. Familiarity with A/B testing, confidence intervals, and hypothesis testing helps for product analytics roles. Deep statistical modeling is typically the territory of data scientists rather than analysts.
How do data analyst interviews differ from data scientist interviews?
Analyst interviews focus on SQL, business metrics, and communication rather than machine learning and programming. Expect more scenario-based questions about calculating KPIs, building dashboards, and presenting insights, with less emphasis on algorithms and statistical modeling and more on practical data manipulation and visualization. Python or R knowledge helps, but it isn't always required for analyst roles the way it is for data scientists.
Final Thoughts
Mastering data analyst interview questions requires balancing technical skills with business understanding. The best preparation combines practicing SQL queries on real datasets, building dashboards that answer actual business questions, and learning to communicate insights clearly to non-technical audiences. Focus on understanding why certain analyses matter to business decisions rather than just memorizing tool features.
Companies value data analysts who translate data into action, not just create reports. Your preparation should include working with messy real-world data, calculating business metrics like retention and growth rates, and presenting findings effectively. Build a portfolio demonstrating complete analysis projects from data extraction through insight communication rather than just showcasing technical skills in isolation.
⚠️ Disclaimer: The interview strategies, sample answers, and negotiation tips provided in this guide are for educational purposes only. Hiring decisions are subjective and vary by company and industry. While these strategies are based on professional HR standards, they do not guarantee a specific job offer or result.








