After you compute a battery of metrics across the fidelity, utility, and privacy dimensions, the raw numbers can be overwhelming. A table full of scores may contain all the information, but it rarely tells the story in an accessible way. Effective visualization transforms these numerical results into clear, interpretable insights, making it easier to understand the strengths and weaknesses of a synthetic dataset and to communicate the findings to stakeholders. The goal is not just to present data, but to guide interpretation and support informed decisions about whether a synthetic dataset is fit for purpose.
The first step is selecting the appropriate chart type for the specific metric or comparison you want to illustrate. Different visualizations excel at highlighting different aspects of the data.
Comparing distributions is fundamental to fidelity assessment. While univariate comparisons are straightforward, visualizing multivariate relationships requires more attention.
Side-by-side or overlaid histograms/density plots immediately show differences in shape, center, and spread for individual features.
Overlaid density histograms comparing the distribution of the 'Age' feature in real and synthetic datasets.
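As a minimal sketch, the snippet below overlays normalized histograms of a single feature with Matplotlib. The `age_real` and `age_synth` arrays are simulated stand-ins for the corresponding columns of your real and synthetic datasets, and the shared bin edges keep the two distributions directly comparable.

```python
# Minimal sketch: overlaid histograms for one feature, assuming the real and
# synthetic values are available as 1-D NumPy arrays (simulated placeholders here).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
age_real = rng.normal(45, 12, 5000)    # placeholder for the real 'Age' column
age_synth = rng.normal(43, 15, 5000)   # placeholder for the synthetic 'Age' column

# Shared bin edges so both histograms are directly comparable.
bins = np.histogram_bin_edges(np.concatenate([age_real, age_synth]), bins=40)

plt.hist(age_real, bins=bins, density=True, alpha=0.5, label="Real")
plt.hist(age_synth, bins=bins, density=True, alpha=0.5, label="Synthetic")
plt.xlabel("Age")
plt.ylabel("Density")
plt.title("Real vs. synthetic distribution of 'Age'")
plt.legend()
plt.show()
```

Using `density=True` normalizes both histograms, which matters when the real and synthetic datasets have different sizes.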
Heatmaps provide an intuitive way to compare correlation matrices. Place the heatmap for the real data next to the heatmap for the synthetic data, using the same color scale. Differences in patterns immediately highlight where the synthetic data fails to capture dependencies.
Side-by-side heatmaps visualizing the correlation matrices of real and synthetic data. Consistent color scaling allows for direct comparison of dependency structures.
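A sketch of this layout follows, assuming the real and synthetic tables are available as pandas DataFrames with matching numeric columns (`real_df` and `synth_df` here are random placeholders). The important detail is fixing `vmin=-1` and `vmax=1` so both heatmaps share one color scale.

```python
# Minimal sketch: side-by-side correlation heatmaps on a shared color scale.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
cols = ["age", "income", "tenure", "score"]
real_df = pd.DataFrame(rng.normal(size=(1000, 4)), columns=cols)   # placeholder real data
synth_df = pd.DataFrame(rng.normal(size=(1000, 4)), columns=cols)  # placeholder synthetic data

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, (name, df) in zip(axes, [("Real", real_df), ("Synthetic", synth_df)]):
    corr = df.corr()
    im = ax.imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")  # identical scale on both panels
    ax.set_xticks(range(len(cols)), cols, rotation=45, ha="right")
    ax.set_yticks(range(len(cols)), cols)
    ax.set_title(f"{name} correlations")
fig.colorbar(im, ax=axes, shrink=0.8)
plt.show()
```

For wide tables, plotting the element-wise difference of the two correlation matrices as a third heatmap can make discrepancies easier to spot.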
ML utility metrics often involve comparing the performance of models trained on different data sources.
Bar charts are effective for comparing key performance indicators (KPIs) such as accuracy, F1-score, or AUC obtained under different training regimes: TRTR (train on real, test on real), TSTR (train on synthetic, test on real), and TRTS (train on real, test on synthetic). Error bars representing confidence intervals or standard deviations across multiple runs add statistical context.
Comparison of AUC scores for two different models across Train-Real/Test-Real (baseline), Train-Synthetic/Test-Real (TSTR utility), and Train-Real/Test-Synthetic (TRTS fidelity) scenarios.
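The following sketch builds such a grouped bar chart with Matplotlib. The AUC means and standard deviations are hypothetical placeholders; in practice they would come from repeated evaluation runs under each regime.

```python
# Minimal sketch: grouped bar chart of AUC under different train/test regimes,
# with error bars for the spread across runs. All numbers are illustrative.
import numpy as np
import matplotlib.pyplot as plt

regimes = ["TRTR", "TSTR", "TRTS"]
auc_model_a = [0.86, 0.82, 0.80]   # hypothetical mean AUCs over repeated runs
auc_model_b = [0.84, 0.79, 0.77]
std_model_a = [0.01, 0.02, 0.02]   # hypothetical standard deviations
std_model_b = [0.01, 0.02, 0.03]

x = np.arange(len(regimes))
width = 0.35

fig, ax = plt.subplots()
ax.bar(x - width / 2, auc_model_a, width, yerr=std_model_a, capsize=4, label="Model A")
ax.bar(x + width / 2, auc_model_b, width, yerr=std_model_b, capsize=4, label="Model B")
ax.set_xticks(x, regimes)
ax.set_ylabel("AUC")
ax.set_ylim(0.5, 1.0)
ax.set_title("Utility comparison across training/testing regimes")
ax.legend()
plt.show()
```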
Comparing feature importance rankings helps assess whether models trained on synthetic data learn similar patterns as those trained on real data. Side-by-side horizontal bar charts can effectively display this comparison.
Side-by-side horizontal bar charts comparing feature importance scores derived from models trained on synthetic versus real data.
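One way to lay this out is sketched below. The feature names and importance scores are illustrative placeholders; in practice they might be taken from, for example, a fitted tree ensemble's `feature_importances_` attribute under each training regime.

```python
# Minimal sketch: side-by-side horizontal bar charts of feature importances
# from models trained on real vs. synthetic data. Values are placeholders.
import numpy as np
import matplotlib.pyplot as plt

features = ["age", "income", "tenure", "score", "region"]
imp_real = np.array([0.35, 0.25, 0.20, 0.12, 0.08])   # hypothetical importances (real-trained model)
imp_synth = np.array([0.30, 0.28, 0.15, 0.17, 0.10])  # hypothetical importances (synthetic-trained model)

y = np.arange(len(features))
fig, (ax_r, ax_s) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
ax_r.barh(y, imp_real, color="tab:blue")
ax_r.set_title("Trained on real data")
ax_s.barh(y, imp_synth, color="tab:orange")
ax_s.set_title("Trained on synthetic data")
ax_r.set_yticks(y, features)
ax_r.invert_yaxis()  # put the first-listed feature at the top (axes share the y-axis)
for ax in (ax_r, ax_s):
    ax.set_xlabel("Importance")
fig.tight_layout()
plt.show()
```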
Privacy evaluations often involve metrics derived from attack simulations.
The performance of a membership inference attack (MIA) classifier is typically visualized using an ROC curve or a Precision-Recall curve. The Area Under the Curve (AUC) provides a single summary statistic, often compared against the baseline of 0.5 (random guessing) using a bar chart.
ROC curve illustrating the trade-off between true positive and false positive rates for a Membership Inference Attack classifier. The diagonal dashed line represents random guessing.
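The sketch below computes and plots an MIA ROC curve with scikit-learn and Matplotlib. The membership labels and attack scores are simulated so the example runs standalone; in a real evaluation they would come from your attack model's predictions on member and non-member records.

```python
# Minimal sketch: ROC curve for a membership inference attack classifier.
# Labels (1 = training member, 0 = non-member) and attack scores are simulated.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(2)
is_member = rng.integers(0, 2, size=2000)
# Simulated attack scores: only weakly informative, so the AUC lands near 0.5.
attack_score = 0.1 * is_member + rng.normal(0, 1, size=2000)

fpr, tpr, _ = roc_curve(is_member, attack_score)
auc = roc_auc_score(is_member, attack_score)

plt.plot(fpr, tpr, label=f"MIA classifier (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Random guessing")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.title("Membership inference attack ROC curve")
plt.legend()
plt.show()
```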
Metrics like Distance to Closest Record (DCR) can be visualized using histograms or density plots that compare the distribution of each synthetic record's minimum distance to the training data against the same distribution computed for real records. Unusually small distances for synthetic records can indicate memorization and potential privacy leakage.
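A sketch of such a DCR comparison follows, assuming the records can be represented as numeric feature matrices. It uses scikit-learn's `NearestNeighbors` to find each record's closest training record and compares synthetic records against a held-out set of real records; all three matrices here are random placeholders.

```python
# Minimal sketch: distributions of distance-to-closest-record (DCR) for
# synthetic records and held-out real records, measured against the training set.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
train = rng.normal(size=(2000, 5))    # placeholder training records
holdout = rng.normal(size=(500, 5))   # placeholder held-out real records
synth = rng.normal(size=(500, 5))     # placeholder synthetic records

# Distance from each query record to its single nearest training record.
nn = NearestNeighbors(n_neighbors=1).fit(train)
dcr_holdout = nn.kneighbors(holdout)[0].ravel()
dcr_synth = nn.kneighbors(synth)[0].ravel()

bins = np.histogram_bin_edges(np.concatenate([dcr_holdout, dcr_synth]), bins=40)
plt.hist(dcr_holdout, bins=bins, density=True, alpha=0.5, label="Held-out real")
plt.hist(dcr_synth, bins=bins, density=True, alpha=0.5, label="Synthetic")
plt.xlabel("Distance to closest training record")
plt.ylabel("Density")
plt.legend()
plt.show()
```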
Often, a single visualization isn't enough. Combining multiple plots into a cohesive report or dashboard provides a holistic view.
For comparing multiple synthetic datasets or generation methods across several normalized metrics (e.g., a fidelity score, a utility score, a privacy score), radar charts offer a compact visual summary. However, ensure the axes are clearly labelled and avoid plotting too many datasets or metrics, which can make the chart unreadable.
Radar chart comparing two synthetic data generation models (Model A and Model B) across five normalized quality metrics. Larger areas generally indicate better performance on those axes.
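Matplotlib has no dedicated radar chart function, but one can be assembled from a polar subplot, as in the sketch below. The metric names and the scores for Model A and Model B are illustrative placeholders, assumed to be normalized to the 0-1 range.

```python
# Minimal sketch: radar chart comparing two models on normalized (0-1) metrics.
# Metric names and scores are illustrative placeholders.
import numpy as np
import matplotlib.pyplot as plt

metrics = ["Fidelity", "Utility", "Privacy", "Coverage", "Diversity"]
scores = {
    "Model A": [0.85, 0.78, 0.70, 0.82, 0.75],
    "Model B": [0.72, 0.81, 0.88, 0.69, 0.80],
}

# One angle per axis; repeat the first point so each polygon closes.
angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, vals in scores.items():
    vals = vals + vals[:1]
    ax.plot(angles, vals, label=name)
    ax.fill(angles, vals, alpha=0.15)
ax.set_xticks(angles[:-1], metrics)
ax.set_ylim(0, 1)
ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.1))
plt.show()
```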
By thoughtfully selecting, designing, and presenting visualizations, you can transform complex evaluation results into compelling evidence that effectively communicates the quality and suitability of your synthetic data.