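ML Models
BurstPick provides 22 ML models across 6 categories. All of them run on-device using Apple CoreML and the Neural Engine, so your photos never leave your Mac. Each model either ships with the app or is downloaded when you select it.
Image Quality Assessment
Scores each photo on technical quality: sharpness, noise, exposure, and perceptual clarity. These scores are used to rank photos within a burst cluster.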
Heuristic (Laplacian + Luma)
Bundled. Instant scoring using Accelerate vDSP/vImage. Measures sharpness, exposure, noise, and eye closure. Cannot judge composition or semantic quality. Best for fast initial culling passes.
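As a rough illustration of what a "Laplacian + Luma" heuristic measures, the sketch below computes the variance of a Laplacian response (a common sharpness proxy) and the mean luma (a simple exposure proxy) on a tiny grayscale grid. It is a pure-Python sketch under stated assumptions, not BurstPick's Accelerate-based implementation; the function names, the 4-neighbour kernel, and the Rec. 601 luma weights are all choices made for this example.

```python
from statistics import mean, pvariance


def luma(r, g, b):
    """Rec. 601 luma from 8-bit RGB components (assumed weighting)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    Higher values mean more edge energy, which usually indicates a sharper image.
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)


def mean_luma(gray):
    """Average brightness in [0, 255]; extreme values hint at poor exposure."""
    return mean(v for row in gray for v in row)


# Tiny synthetic frames: a hard vertical edge versus a flat gray patch.
sharp = [[0, 0, 255, 255] for _ in range(4)]
flat = [[128] * 4 for _ in range(4)]
```

The edge frame yields a large Laplacian variance while the flat patch yields zero, which is the property a culling pass can use to push blurry frames down the ranking.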
TOPIQ NR
No-reference IQA using a ResNet50 backbone. Produces solid general-purpose technical quality scores. A balanced option for quality assessment.
MUSIQ (KonIQ)
Multi-scale transformer trained on KonIQ-10k real-world distortions. Strong on natural photos with better perceptual alignment than TOPIQ.
MANIQA
NTIRE 2022 IQA Challenge winner. Multi-dimension attention captures fine perceptual differences. Most accurate but largest in category.
NIMA (MobileNet)
Neural Image Assessment trained on AVA (250K aesthetic ratings). Outputs a 10-class probability distribution. Compact MobileNet backbone, making it the fastest aesthetic quality model.
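A 10-class probability distribution is usually collapsed into a single score by taking its expected value over the ratings 1 to 10. The sketch below shows that reduction, plus the standard deviation as a rough measure of how much the predicted ratings disagree. The function name and the renormalisation step are assumptions for this example, not part of BurstPick.

```python
from math import sqrt


def nima_mean_and_std(probs):
    """Collapse a 10-bin rating distribution (scores 1..10) into mean and std."""
    if len(probs) != 10:
        raise ValueError("expected 10 class probabilities")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalise defensively in case the model output does not sum to exactly 1.
    p = [v / total for v in probs]
    mean = sum(score * pi for score, pi in enumerate(p, start=1))
    var = sum(pi * (score - mean) ** 2 for score, pi in enumerate(p, start=1))
    return mean, sqrt(var)


# A uniform distribution sits at the midpoint of the 1..10 scale.
uniform_mean, uniform_std = nima_mean_and_std([0.1] * 10)
```

Two photos with the same mean can have different spreads; a narrow spread suggests the model is more certain about the rating.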
Aesthetic Scoring
Rates aesthetic appeal based on composition, color harmony, and visual balance. These models are trained on large-scale human preference data.
LAION Aesthetic v1
Bundled. Lightweight linear probe on CLIP embeddings, with near-zero overhead if CLIP is already loaded. Trained on LAION aesthetic ratings. Good default aesthetic scorer.
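A linear probe adds almost no cost because it is just a weighted sum over an embedding that has already been computed. The sketch below shows the idea with a toy 4-dimensional vector; real CLIP embeddings have hundreds of dimensions. The weights, the bias, and the L2-normalisation step are hypothetical values and assumptions for illustration, not the shipped model's parameters.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = sqrt(sum(v * v for v in vec))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [v / norm for v in vec]


def linear_probe_score(embedding, weights, bias):
    """Score = w . normalised(embedding) + b, a single dot product."""
    if len(embedding) != len(weights):
        raise ValueError("embedding and weights must have the same length")
    unit = l2_normalize(embedding)
    return sum(w * x for w, x in zip(weights, unit)) + bias


# Toy example with made-up probe parameters.
toy_embedding = [3.0, 4.0, 0.0, 0.0]
toy_weights = [1.0, 0.0, 0.0, 0.0]
toy_score = linear_probe_score(toy_embedding, toy_weights, bias=0.5)
```

Because the expensive part is the CLIP forward pass, the probe can reuse embeddings that were already produced for clustering.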
ViT-B/16 Aesthetic
Standalone ViT-B/16 fine-tuned on AVA dataset (250K human aesthetic ratings). More nuanced aesthetic judgment than LAION probe. Independent of CLIP.
Image Embeddings
Converts photos into vectors for similarity-based clustering. Used for burst grouping and duplicate detection.
Apple Vision FeaturePrint
Bundled. Built into macOS: zero download, instant availability. Good general-purpose scene similarity. Best for speed-first workflows.
DINOv2 ViT-S/14
State-of-the-art self-supervised features (Meta, LVD-142M). Excellent visual similarity and scene structure. Recommended balanced choice.
CLIP ViT-B/32
Rich semantic understanding from multimodal training. Groups photos by content meaning. Required by LAION Aesthetic scorer. Best for diverse libraries.
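To make similarity-based burst grouping concrete, the sketch below splits a capture-ordered list of embeddings into runs of consecutive frames whose cosine similarity stays above a threshold. This is one simple strategy chosen for illustration; the function names and the 0.9 threshold are assumptions, and BurstPick's actual clustering may work differently.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_bursts(embeddings, threshold=0.9):
    """Split capture-ordered embeddings into runs of similar consecutive frames.

    Returns a list of groups, each a list of frame indices.
    """
    groups = []
    for i, emb in enumerate(embeddings):
        if groups and cosine_similarity(embeddings[i - 1], emb) >= threshold:
            groups[-1].append(i)   # still the same burst
        else:
            groups.append([i])     # scene changed, start a new burst
    return groups


# Toy 2-D embeddings: two near-identical frames, then two frames of another scene.
frames = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.05, 1.0]]
bursts = group_bursts(frames)
```

Comparing only neighbouring frames keeps the pass linear in the number of photos; duplicate detection across a whole library would need all-pairs or indexed comparison instead.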
Face Embeddings
Generates face identity vectors and clusters photos by person.
EdgeFace-XS
Fastest option, with a lightweight 4 MB download. Good face grouping for most photos (LFW 99.73%). Best when speed is the priority.
EdgeFace-S
Good balance of speed and accuracy (LFW 99.82%, IJB-B 94.38%). Small download. Handles varied lighting well. Recommended balanced choice.
AdaFace IR-18
Bundled. Strong on low-quality and challenging face crops via adaptive margin (CVPR 2022). LFW 99.82%. Good mid-tier choice.
AdaFace IR-50
Top-tier accuracy (LFW 99.82%, IJB-B 95.67%). Excels on difficult poses and low-quality crops. Best when face grouping precision is critical.
AuraFace v1
Large ResNet-100 backbone with permissive Apache 2.0 license. Choose mainly for licensing requirements.
GhostFaceNets
SOTA 2025 lightweight face recognition model. High performance with minimal computational overhead.
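Whichever face model is selected, grouping by person comes down to linking face vectors that are close enough to each other. The sketch below links every pair whose cosine similarity clears a threshold and reads off the connected components with a small union-find. The function names, the 0.5 threshold, and the 3-D toy vectors are assumptions for illustration; real face embeddings are much higher-dimensional and the app's clustering method is not specified here.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_faces(embeddings, threshold=0.5):
    """Group face indices into identities via thresholded pairwise links."""
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        # Walk to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two identities

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy vectors: two faces of person A, two of person B, one of person C.
faces = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.95, 0.05], [0, 0, 1]]
people = cluster_faces(faces)
```

A higher threshold splits identities more aggressively, while a lower one risks merging different people; more accurate embeddings widen the margin between those two failure modes.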
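Vision Language Models (VLM)
Understands photo content through natural language. Handles analysis that a numeric score cannot express, such as scene descriptions and quality judgments.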
Heuristic Estimate
Bundled. Built-in fallback using heuristic image analysis (sharpness, exposure, noise, faces). No download required. Replace with a real VLM for improved results.
SmolVLM2 256M
Smallest VLM, with the fastest inference and minimal memory use. Basic scene recognition and quality commentary. Best for quick screening on constrained hardware.
SmolVLM2 2.2B
Full-size SmolVLM with strong scene understanding and quality reasoning. More capable but slower than 256M variant.
FastVLM 0.5B
Apple FastVLM with FastViTHD hybrid encoder. Optimized for on-device speed with solid scene recognition. Recommended balanced VLM choice.
FastVLM 1.5B
Largest and most capable VLM. Deep scene understanding, nuanced quality reasoning, and detailed descriptions. Best when VLM quality is the top priority.
Image Classification
Tags photos with scene and object labels. Based on the Apple Vision framework.
Apple Vision Classification
Bundled. Built-in macOS image classification using VNClassifyImageRequest. Fast, no download required. Provides scene and object tags for filtering.
