seaborn
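Seaborn is a statistical plotting library built on matplotlib, designed for working with DataFrame data and variable mappings. It ships with common statistical chart types such as box plots, violin plots, pair plots, and heatmaps, and it computes confidence intervals and grouped statistics automatically. In paper writing, it is used to quickly carry out data exploration and ...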
Statistical visualization with pandas integration. Use for quick exploration of distributions, relationships, and categorical comparisons with attractive defaults. Best for box plots, violin plots, pair plots, heatmaps. Built on matplotlib. For interactive plots use plotly; for publication styling use scientific-visualization.
- Repo: Chanw-research/claude-code-paper-writing
- Slug: seaborn
SKILL.md
Seaborn Statistical Visualization
Overview
Seaborn is a Python visualization library for creating publication-quality statistical graphics. Use this skill for dataset-oriented plotting, multivariate analysis, automatic statistical estimation, and complex multi-panel figures with minimal code.
Design Philosophy
Seaborn follows these core principles:
- Dataset-oriented: Work directly with DataFrames and named variables rather than abstract coordinates
- Semantic mapping: Automatically translate data values into visual properties (colors, sizes, styles)
- Statistical awareness: Built-in aggregation, error estimation, and confidence intervals
- Aesthetic defaults: Publication-ready themes and color palettes out of the box
- Matplotlib integration: Full compatibility with matplotlib customization when needed
Quick Start
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Load example dataset
df = sns.load_dataset('tips')
# Create a simple visualization
sns.scatterplot(data=df, x='total_bill', y='tip', hue='day')
plt.show()
Core Plotting Interfaces
Function Interface (Traditional)
The function interface provides specialized plotting functions organized by visualization type. Each category has axes-level functions (plot to single axes) and figure-level functions (manage entire figure with faceting).
When to use:
- Quick exploratory analysis
- Single-purpose visualizations
- When you need a specific plot type
Objects Interface (Modern)
The seaborn.objects interface provides a declarative, composable API similar to ggplot2. Build visualizations by chaining methods to specify data mappings, marks, transformations, and scales.
When to use:
- Complex layered visualizations
- When you need fine-grained control over transformations
- Building custom plot types
- Programmatic plot generation
from seaborn import objects as so
# Declarative syntax
(
    so.Plot(data=df, x='total_bill', y='tip')
    .add(so.Dot(), color='day')
    .add(so.Line(), so.PolyFit())
)
Plotting Functions by Category
Relational Plots (Relationships Between Variables)
Use for: Exploring how two or more variables relate to each other
- scatterplot() - Display individual observations as points
- lineplot() - Show trends and changes (automatically aggregates and computes CI)
- relplot() - Figure-level interface with automatic faceting
Key parameters:
- x, y - Primary variables
- hue - Color encoding for an additional categorical or continuous variable
- size - Point/line size encoding
- style - Marker/line style encoding
- col, row - Facet into multiple subplots (figure-level only)
# Scatter with multiple semantic mappings
sns.scatterplot(data=df, x='total_bill', y='tip',
                hue='time', size='size', style='sex')
# Line plot with confidence intervals
# (timeseries is assumed to be a long-form DataFrame with date, value, and category columns)
sns.lineplot(data=timeseries, x='date', y='value', hue='category')
# Faceted relational plot
sns.relplot(data=df, x='total_bill', y='tip',
            col='time', row='sex', hue='smoker', kind='scatter')
Distribution Plots (Single and Bivariate Distributions)
Use for: Understanding data spread, shape, and probability density
- histplot() - Bar-based frequency distributions with flexible binning
- kdeplot() - Smooth density estimates using Gaussian kernels
- ecdfplot() - Empirical cumulative distribution (no parameters to tune)
- rugplot() - Individual observation tick marks
- displot() - Figure-level interface for univariate and bivariate distributions
- jointplot() - Bivariate plot with marginal distributions
- pairplot() - Matrix of pairwise relationships across a dataset
Key parameters:
- x, y - Variables (y optional for univariate)
- hue - Separate distributions by category
- stat - Normalization: "count", "frequency", "probability", "density"
- bins / binwidth - Histogram binning control
- bw_adjust - KDE bandwidth multiplier (higher = smoother)
- fill - Fill area under curve
- multiple - How to handle hue: "layer", "stack", "dodge", "fill"
# Histogram with density normalization
sns.histplot(data=df, x='total_bill', hue='time',
             stat='density', multiple='stack')
# Bivariate KDE with contours
sns.kdeplot(data=df, x='total_bill', y='tip',
            fill=True, levels=5, thresh=0.1)
# Joint plot with marginals
sns.jointplot(data=df, x='total_bill', y='tip',
              kind='scatter', hue='time')
# Pairwise relationships
# ('species' is a column of the iris example dataset, not of tips)
iris = sns.load_dataset('iris')
sns.pairplot(data=iris, hue='species', corner=True)
Categorical Plots (Comparisons Across Categories)
Use for: Comparing distributions or statistics across discrete categories
Categorical scatterplots:
- stripplot() - Points with jitter to show all observations
- swarmplot() - Non-overlapping points (beeswarm algorithm)
Distribution comparisons:
- boxplot() - Quartiles and outliers
- violinplot() - KDE + quartile information
- boxenplot() - Enhanced boxplot for larger datasets
Statistical estimates:
- barplot() - Mean/aggregate with confidence intervals
- pointplot() - Point estimates with connecting lines
- countplot() - Count of observations per category
Figure-level:
- catplot() - Faceted categorical plots (set the kind parameter)
Key parameters:
- x, y - Variables (one typically categorical)
- hue - Additional categorical grouping
- order, hue_order - Control category ordering
- dodge - Separate hue levels side-by-side
- orient - "v" (vertical) or "h" (horizontal)
- kind - Plot type for catplot: "strip", "swarm", "box", "violin", "boxen", "bar", "point", "count"
# Swarm plot showing all points
sns.swarmplot(data=df, x='day', y='total_bill', hue='sex')
# Violin plot with split for comparison
sns.violinplot(data=df, x='day', y='total_bill',
               hue='sex', split=True)
# Bar plot with error bars
sns.barplot(data=df, x='day', y='total_bill',
            hue='sex', estimator='mean', errorbar='ci')
# Faceted categorical plot
sns.catplot(data=df, x='day', y='total_bill',
            col='time', kind='box')
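The order and hue_order parameters are easy to overlook, so here is a short sketch that forces a meaningful category order instead of the order in which values appear in the data. The data and the column names (dose, arm, response) are synthetic assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Categories deliberately appear out of their natural order in the data.
rng = np.random.default_rng(3)
scores = pd.DataFrame({
    "dose": np.repeat(["mid", "low", "high"], 30),
    "arm": np.tile(["a", "b"], 45),
    "response": rng.normal(size=90),
})
dose_order = ["low", "mid", "high"]

# Axes-level: explicit ordering of both the x categories and the hue levels.
ax = sns.boxplot(data=scores, x="dose", y="response", hue="arm",
                 order=dose_order, hue_order=["a", "b"])

# Figure-level: same plot type selected through kind, one panel per arm.
grid = sns.catplot(data=scores, x="dose", y="response", col="arm",
                   kind="box", order=dose_order)
```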
Regression Plots (Linear Relationships)
Use for: Visualizing linear regressions and residuals
- regplot() - Axes-level regression plot with scatter + fit line
- lmplot() - Figure-level with faceting support
- residplot() - Residual plot for assessing model fit
Key parameters:
- x, y - Variables to regress
- order - Polynomial regression order
- logistic - Fit logistic regression
- robust - Use robust regression (less sensitive to outliers)
- ci - Confidence interval width (default 95)
- scatter_kws, line_kws - Customize scatter and line properties
# Simple linear regression
sns.regplot(data=df, x='total_bill', y='tip')
# Polynomial regression with faceting
sns.lmplot(data=df, x='total_bill', y='tip',
           col='time', order=2, ci=95)
# Check residuals
sns.residplot(data=df, x='total_bill', y='tip')
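As a rough illustration of scatter_kws and line_kws, the sketch below fits a line to synthetic data with a known slope and places the residual plot next to it. The frame fit_df and its columns are hypothetical. The robust and logistic options are left out here because, as far as I know, they depend on statsmodels being installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic linear relationship: y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 80)
fit_df = pd.DataFrame({"x": x, "y": 2 * x + 1 + rng.normal(scale=2, size=80)})

fig, (reg_ax, resid_ax) = plt.subplots(1, 2, figsize=(8, 3))

# scatter_kws is forwarded to the scatter points, line_kws to the fitted line.
sns.regplot(data=fit_df, x="x", y="y", ci=95,
            scatter_kws={"alpha": 0.5, "s": 20},
            line_kws={"color": "black"}, ax=reg_ax)

# Residuals should scatter evenly around zero if a straight line is adequate.
sns.residplot(data=fit_df, x="x", y="y", ax=resid_ax)
```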
Matrix Plots (Rectangular Data)
Use for: Visualizing matrices, correlations, and grid-structured data
- heatmap() - Color-encoded matrix with annotations
- clustermap() - Hierarchically-clustered heatmap
Key parameters:
- data - 2D rectangular dataset (DataFrame or array)
- annot - Display values in cells
- fmt - Format string for annotations (e.g., ".2f")
- cmap - Colormap name
- center - Value at colormap center (for diverging colormaps)
- vmin, vmax - Color scale limits
- square - Force square cells
- linewidths - Gap between cells
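# Correlation heatmap
# (numeric_only=True skips categorical columns such as 'sex' and 'day')
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True, fmt='.2f',
            cmap='coolwarm', center=0, square=True)
# Clustered heatmap
# (data is assumed to be a numeric 2D DataFrame or array)
sns.clustermap(data, cmap='viridis',
               standard_scale=1, figsize=(10, 10))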
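A correlation matrix is symmetric, so one common variation is to hide the redundant upper triangle. The sketch below does this with the mask parameter of heatmap(), which is not listed above; the data is synthetic and the column names are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Four synthetic numeric columns with placeholder names.
rng = np.random.default_rng(5)
measurements = pd.DataFrame(rng.normal(size=(50, 4)), columns=list("abcd"))
corr = measurements.corr()

# True marks cells to hide: everything strictly above the diagonal.
mask = np.triu(np.ones_like(corr, dtype=bool), k=1)

# Fixing vmin/vmax to the correlation range keeps colors comparable across figures.
ax = sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
                 cmap="coolwarm", center=0, vmin=-1, vmax=1,
                 square=True, linewidths=0.5)
```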
Multi-Plot Grids
Seaborn provides grid objects for creating complex multi-panel figures:
FacetGrid