2026 Mini ToGeDA workshop at KAIST

Dates: January 7–8, 2026 (Wed–Thu)
Venue: KAIST Daejeon Main Campus, Industrial Engineering & Management Building (E2-1), Rooms 1223/1501 (joint lecture room)
About: This workshop aims to create a venue for exchange among young Korean researchers interested in applying topology and geometry to data analysis.

* Each 50-minute talk includes 10 minutes of Q&A.

January 7 (Day 1)

13:00–13:25 Registration and free time (coffee and refreshments)
13:25–13:30 Opening remarks

13:30–14:20 임성현: What exactly does persistent homology do?
14:20–14:30 Break

14:30–15:20 유준원: Persistent Homology as a Lens for Understanding Graphs and Latent Spaces in Representation Learning
15:20–15:30 Break

15:30–16:20 최병찬: Persistent Homology with Path-Representable Distances on Graph Data: Injection and Total Persistence Difference
16:20–16:30 Break

16:30–17:20 Flash Talks (15-min each, including Q&A)

  • 강동우: Oriented Grassmannian Embeddings for Narrow Cycles
  • 김현규: Support Estimation with Topological Guarantee
  • 한영웅: A Two-Sample Test on Weighted Persistence Intensity Functions in Topological Data Analysis

Dinner


January 8 (Day 2)

08:30–08:50 Free time (coffee and refreshments)

08:50–09:40 김지수: Statistical Estimation of Persistent Homology and Applications to Machine Learning
09:40–09:50 Break

09:50–10:40 임선혁: G-Gromov-Hausdorff Distances and Equivariant Topology
10:40–10:50 Break

10:50–11:40 임성현: Universal topological statistics on triangulable spaces
11:40–11:45 Group Photo

11:45–13:10 Lunch

13:10–14:00 이강주: Combinatorics and Optimization for Mapper
14:00–14:10 Break

14:10–15:00 김근수: Exploring Topological Data Analysis: Focusing on Research Conducted in 2025
15:00–15:10 Break

15:10–16:00 최성진: Extending the Domain of Topological Deep Learning: Symmetric Simplicial Sets and Graded Posets
16:00–16:10 Break

16:10–17:00 성원: Interleaving distance as an Edit Distance


Speakers: 강동우 (Seoul National University), 김근수 (Kyushu University), 김지수 (Seoul National University), 김현규 (Seoul National University), 성원 (KAIST), 유준원 (POSTECH), 이강주 (Seoul National University), 임선혁 (Sungkyunkwan University), 임성현 (Uzu Lim) (Queen Mary University of London), 최병찬 (POSTECH), 최성진 (POSTECH), 한영웅 (Seoul National University)


Abstracts

강동우. Oriented Grassmannian Embeddings for Narrow Cycles

Smooth closed orientable submanifolds embedded in Euclidean space may contain narrow cycles that arise along directions of large normal curvature. We study the embedding of such a manifold into a scaled oriented Grassmannian bundle via the Gauss map and use the distance induced by this embedding as a new extrinsic metric on the manifold. For suitable choices of the scaling parameter, this embedding decreases the normal curvature along the directions that create narrow cycles and, in the hypersurface case, increases the distance between antipodal points of a narrow cycle at fixed volume. As a consequence, these cycles appear at larger radii in the Čech or Vietoris–Rips filtrations built from this distance, yielding longer bars in the resulting persistent homology.


김근수. Exploring Topological Data Analysis: Focusing on Research Conducted in 2025

This talk presents educational and research perspectives on TDA, organized around three threads.

First, from an educational standpoint, I compare statistical inference based on kernel density estimation (KDE) with topological inference using persistent homology, contrasting how the two approaches understand the structure of data.

Next, I give an update on the speaker's first line of research, covering the discovery of subclusters, their quantitative comparison, and a Fourier-based interpretation, and then discuss applications of the homotopy-type classification of Vietoris–Rips complexes of the circle, together with possible connections to symplectic invariants in topological time-series analysis.

Finally, I conclude by introducing ongoing work on Nonnegative Matrix Factorization with topological regularization.


김지수. Statistical Estimation of Persistent Homology and Applications to Machine Learning

This talk introduces statistical estimation methods for persistent homology, a core tool of Topological Data Analysis (TDA), together with various approaches to applying it in machine learning. Persistent homology analyzes data at multiple resolutions and captures topological structures that persist across scales, thereby effectively summarizing the multi-scale topological information inherent in the data. In recent years, theoretical work on the stability and statistical properties of this topological information has accumulated, while its practical applicability to machine learning problems has also been actively explored.

The first part of the talk briefly reviews the basic concepts of persistent homology and then addresses the problem of statistically estimating persistent homology from random data. In particular, we analyze the variability of persistent homology arising from the randomness of the data and, building on stability results in the bottleneck distance, introduce methods that quantify this uncertainty in the form of confidence sets. This yields a procedure for selecting statistically significant topological features from an observed persistent homology.

The second part covers two representative approaches to applying persistent homology in machine learning. The first is featurization: transforming persistent homology, which carries complex geometric and topological structure, into Euclidean vectors or functions so that it can be integrated into standard machine learning algorithms. The second is evaluation: assessing the quality of data representations or trained models by analyzing the topological structure they preserve. Through practical application examples, we review the theoretical foundations and empirical performance of these methods, and highlight the interface between persistent homology and machine learning along with future research directions.
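As a deliberately crude illustration of the featurization idea, not a method from the talk, a persistence diagram can be mapped to a fixed-length Euclidean vector, here by histogramming bar lifetimes; the bin count and lifetime cap below are arbitrary choices.

```python
def lifetime_histogram(diagram, bins=4, max_life=2.0):
    """Toy featurization: histogram of bar lifetimes (death - birth)
    as a fixed-length vector usable by standard ML algorithms."""
    vec = [0] * bins
    width = max_life / bins
    for birth, death in diagram:
        life = min(death - birth, max_life - 1e-9)  # cap so the last bin is closed
        vec[int(life / width)] += 1
    return vec

# Diagram with two short (noisy) bars and two long (signal) bars.
vec = lifetime_histogram([(0.0, 0.3), (0.1, 0.4), (0.2, 1.5), (0.0, 1.9)])
print(vec)  # -> [2, 0, 1, 1]
```

Real featurizations (persistence images, landscapes) are smooth and stability-aware; the point here is only the diagram-to-vector step.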


김현규. Support Estimation with Topological Guarantee

Support estimation is a fundamental tool for understanding the topological and geometric properties of the underlying structure from which data are drawn. Despite its importance, only a few statistical methods exist for reconstructing the topological structure of the support. In this work, we establish the topological consistency of a support estimator based on a kernel density estimator, in both fixed and random data settings. Our results offer more robust topological inference than existing methods.
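As a plain illustration of the underlying plug-in idea only, not the estimator analyzed in this work, one can estimate a support as a superlevel set of a kernel density estimate. The bandwidth h = 0.2 and threshold 0.1 below are arbitrary choices for a toy uniform sample on [-1, 1].

```python
import numpy as np

def kde(grid, sample, h):
    """Gaussian kernel density estimate of the sample, evaluated on grid."""
    diffs = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * diffs**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
sample = rng.uniform(-1.0, 1.0, size=500)   # true support: [-1, 1]
grid = np.linspace(-3, 3, 601)
fhat = kde(grid, sample, h=0.2)
support_est = grid[fhat >= 0.1]             # plug-in estimate: superlevel set at t = 0.1
print(support_est.min(), support_est.max()) # roughly -1 and 1, up to bandwidth smearing
```

The theoretical content of the talk concerns when such an estimate recovers the support with the correct topology; this sketch only shows the mechanics.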


성원. Interleaving distance as an Edit Distance

The concept of edit distance, which dates back to the 1960s in the context of comparing word strings, has since found numerous applications, with various adaptations, in computer science, computational biology, and applied topology. By contrast, the interleaving distance, introduced in the 2000s within the study of persistent homology, has become a foundational metric in topological data analysis. In this work, we show that the interleaving distance on finitely presented single- and multi-parameter persistence modules can be formulated as an edit distance. The key lies in clarifying a connection between the Galois connection and the interleaving distance, via the established relation between the interleaving distance and free presentations of persistence modules. In addition to offering new perspectives on the interleaving distance, we expect our findings to facilitate the study of stability properties of invariants of multi-parameter persistence modules. As an application of the edit formulation of the interleaving distance, we present an alternative proof of the well-known bottleneck stability theorem.
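For reference, the classical word-string edit distance mentioned at the start is the Levenshtein distance, computable by a short dynamic program; this is background only, not the edit distance on persistence modules constructed in the work.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))          # distances from a[:0] to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to the empty string
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete ca
                           cur[j - 1] + 1,           # insert cb
                           prev[j - 1] + (ca != cb)  # substitute (free on match)
                           ))
        prev = cur
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # -> 3
```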


유준원. Persistent Homology as a Lens for Understanding Graphs and Latent Spaces in Representation Learning

Recent advances in topological data analysis (TDA) have shown that persistent homology can serve as a powerful tool for extracting structural information from complex data. In this talk, I present a series of studies that integrate TDA into both graph reasoning and representation learning, highlighting how topological features contribute interpretability and performance improvements to modern deep learning models. I first introduce PHLP, a simple and interpretable method for link prediction, followed by its extension to knowledge graph completion. On the representation learning side, I show that latent-space topology helps quantify structural discrepancies during knowledge distillation and improves the transfer of geometric information. I also present our work on topological alignment for vision–language embeddings. Together, these studies illustrate how topology offers a unifying lens for understanding and improving learning across diverse data types and model settings.


이강주. Combinatorics and Optimization for Mapper

Mapper is a network-based visualization technique in Topological Data Analysis (TDA) that has been used in a variety of applications. It constructs a simplicial complex, typically represented as a graph, by taking the nerve of a refinement of the pullback cover induced by a lens function and a cover of its image. A major challenge of the Mapper algorithm is that it requires tuning several parameters to generate a “nice” Mapper graph. In principle, for a given dataset, Mapper can generate any graph as output with an appropriate choice of parameters, reflecting the combinatorial richness of the Mapper construction. Focusing on the cover parameter, we develop an algorithm that iteratively splits the cover according to a statistical test of normality, thereby optimizing the structure of the resulting Mapper graph to produce more insightful visualizations.
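The construction described above can be sketched end to end on a toy dataset. Everything below is hand-picked for illustration: the lens is the x-coordinate, the cover is four overlapping intervals, and clustering within each preimage is plain single-linkage with a fixed threshold, not the statistically optimized cover splitting developed in the talk.

```python
import math

def clusters(points, idxs, eps):
    """Single-linkage clusters of points[i] for i in idxs:
    connected components of the eps-neighborhood graph."""
    idxs, seen, out = list(idxs), set(), []
    for i in idxs:
        if i in seen:
            continue
        comp, stack = set(), [i]
        while stack:
            j = stack.pop()
            if j in comp:
                continue
            comp.add(j)
            stack.extend(k for k in idxs
                         if k not in comp and math.dist(points[j], points[k]) <= eps)
        seen |= comp
        out.append(frozenset(comp))
    return out

# Toy data: 40 points on the unit circle; lens f = x-coordinate.
pts = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40)) for k in range(40)]
lens = [p[0] for p in pts]

# Overlapping interval cover of the lens image (hand-tuned overlaps).
cover = [(-1.01, -0.4), (-0.6, 0.1), (-0.1, 0.6), (0.4, 1.01)]

# Nodes: clusters of each preimage; edges: clusters sharing a point (the nerve).
nodes = []
for lo, hi in cover:
    pre = [i for i in range(len(pts)) if lo <= lens[i] <= hi]
    nodes.extend(clusters(pts, pre, eps=0.3))

edges = set()
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        if nodes[i] & nodes[j]:
            edges.add((i, j))

print(len(nodes), len(edges))  # 6 nodes, 6 edges: the Mapper graph of a circle is a cycle
```

Changing the cover parameters changes the output graph, which is exactly the tuning problem the talk addresses.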


임선혁. G-Gromov-Hausdorff Distances and Equivariant Topology

In this talk, we introduce the G-Gromov-Hausdorff distance ($d_{GH}^{G}$) for compact metric spaces equipped with finite group actions and establish stability results using G-equivariant Vietoris-Rips metric thickenings. We prove that $d_{GH}^{G}$ is bounded below by the G-homotopy type distance and G-persistent indices, enabling new equivariant rigidity and finiteness theorems for classes of Riemannian manifolds. Finally, we apply this framework to derive a G-equivariant quantitative Borsuk-Ulam theorem and determine sharp bounds for the $\mathbb{Z}_{2}$-Gromov-Hausdorff distance between spheres equipped with geodesic, Euclidean, and $l^{\infty}$ metrics.


임성현 (Day 1). What exactly does persistent homology do?

Persistent homology (PH) is the central tool of topological data analysis, and it is a canonical multi-scale method for assigning homology groups to a discrete metric space. Yet PH is often carelessly invoked with a vague promise of capturing “the shape of data,” even though what it measures is the creation and destruction of holes. PH is also known to capture non-hole information such as curvature and density distribution, but such phenomena should not be mystified; they deserve a precise mathematical analysis. In this talk, I will critically re-examine the standard PH pipeline and highlight gaps in our understanding. I will cover recent advances in the theory and applications of PH, and propose a more precise framework for the “when and why” of persistent homology.
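As a concrete anchor for the pipeline discussed above, the 0-dimensional part of PH admits an exact elementary computation: in the Vietoris-Rips filtration every point is born at scale 0, and a connected component dies at the minimum-spanning-tree edge weight at which it merges into another. A self-contained sketch using Kruskal's algorithm with union-find:

```python
import math
from itertools import combinations

def h0_barcode(points):
    """0-dimensional persistence barcode of the Vietoris-Rips filtration:
    bars are (0, w) for each MST edge weight w, plus one essential bar."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    deaths = []
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    for w, i, j in edges:                  # Kruskal: scan edges by increasing length
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(w)               # one component dies at each merge
    return [(0.0, d) for d in deaths] + [(0.0, math.inf)]

bars = h0_barcode([(0,), (1,), (3,), (7,)])
print(bars)  # -> [(0.0, 1.0), (0.0, 2.0), (0.0, 4.0), (0.0, inf)]
```

Higher-dimensional PH requires boundary-matrix reduction and is where the talk's "what exactly does it do" question becomes subtle; dimension 0 is the transparent base case.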


임성현 (Day 2). Universal topological statistics on triangulable spaces

A persistence diagram (PD) summarises the holes and cavities present in a dataset, and it is crucial to distinguish signal from noise in a PD. A hypothesis test for identifying genuine topological signal was proposed by Bobrowski and Skraba, made possible by a universal statistical distribution of multiplicative persistence. In this talk, I will describe an ongoing collaboration with Bobrowski and Skraba showing that the universality law applies not only to data distributions on Euclidean space but also to a wide class of triangulable spaces, such as manifolds and algebraic varieties. Furthermore, the universality law also appears in non-PD statistics, such as nearest-neighbour distance ratios. Together, these results establish a statistical theory of topological and geometric quantities with applications in TDA and intrinsic dimension estimation.


최병찬. Persistent Homology with Path-Representable Distances on Graph Data: Injection and Total Persistence Difference

Persistent homology (PH) has been widely applied to graph data to extract topological features. However, little attention has been paid to how different distance functions on a graph affect the resulting persistence diagrams and their interpretations. In this talk, we define a class of distances on graphs, called path-representable distances, and investigate structural relationships between their induced persistent homologies. In particular, we identify a nontrivial injection between the 1-dimensional barcodes induced by two commonly used graph distances: the unweighted and weighted shortest-path distances. We formally establish sufficient conditions under which such embeddings arise, focusing on a subclass we call cost-dominated distances. The injection property is shown to hold in dimensions 0 and 1, while we provide counterexamples in higher dimensions. To make these relationships measurable, we introduce the total persistence difference (TPD), a new topological measure that quantifies changes between filtrations induced by cost-dominated distances on a fixed graph. We prove a stability result for TPD when the distance functions admit a partial order, and apply the method to the SNAP EU Research Institution E-Mail dataset. TPD captures both periodic patterns and global trends in the data and shows stronger alignment with classical graph statistics compared to an existing PH-based measure applied to the same dataset.
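The two filtrations compared here come from distance matrices on the same graph. As background only, a small sketch of how the unweighted (hop) and weighted shortest-path distances can disagree; the four-cycle with one heavy edge is a made-up example, and the implementation is textbook Dijkstra.

```python
import heapq

def shortest_paths(n, edges, weighted=True):
    """All-pairs shortest paths via Dijkstra from each source.
    With weighted=False every edge costs 1, giving the hop distance."""
    adj = {i: [] for i in range(n)}
    for u, v, w in edges:
        c = w if weighted else 1
        adj[u].append((v, c))
        adj[v].append((u, c))
    dist = [[float('inf')] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        pq = [(0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[s][u]:
                continue                      # stale queue entry
            for v, c in adj[u]:
                if d + c < dist[s][v]:
                    dist[s][v] = d + c
                    heapq.heappush(pq, (d + c, v))
    return dist

# Cycle 0-1-2-3-0 where edge (3,0) has weight 5.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 5)]
hop = shortest_paths(4, edges, weighted=False)
wtd = shortest_paths(4, edges, weighted=True)
print(hop[0][3], wtd[0][3])  # -> 1 3: the weighted geodesic goes the long way around
```

Feeding these two matrices into a Rips filtration yields the two persistent homologies whose relationship the talk analyzes.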


최성진. Extending the Domain of Topological Deep Learning: Symmetric Simplicial Sets and Graded Posets

Topological Deep Learning extends neural networks to higher-order structures like hypergraphs, simplicial complexes, regular CW complexes, and combinatorial complexes. However, the field currently lacks a unified mathematical definition for these “higher-order domains.” In this talk, I establish a rigorous definition of higher-order domains based on element dimension and containment relations, thereby unifying previous examples under a single framework. I show that these domains form topological spaces that naturally admit a structure of upper and lower adjacencies. I then introduce two new classes of higher-order domains: symmetric simplicial sets and graded posets. I will discuss how these structures facilitate meaningful extensions of graph-based methods, specifically neural sheaf diffusion and the Weisfeiler-Lehman test.


한영웅. A Two-Sample Test on Weighted Persistence Intensity Functions in Topological Data Analysis

The intensity function, defined as the density of the expected measure of a persistence diagram with respect to the Lebesgue measure, serves as a key summary representation of the probability distribution of persistence diagrams and plays a crucial role in statistical inference for topological data analysis (TDA). Although several methods have been proposed for estimating intensity functions, no statistical procedure has been developed for two-sample testing of their homogeneity. Moreover, theoretical power analysis for a two-sample test in TDA has not been explored, mainly due to the intrinsic non-Euclidean nature of the space of persistence diagrams. We propose a kernel-based permutation test for assessing the homogeneity of intensity functions and provide a theoretical analysis of its power, leading to a minimax optimal bandwidth of the kernel. To address the fact that the optimal bandwidth is not directly accessible, we also adopt the bandwidth aggregation framework and analyze its testing power. Simulation studies and real-data analysis demonstrate that our test is valid and achieves high empirical power.
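The proposed test is specific to weighted persistence intensity functions, but its outer loop is the generic permutation-test skeleton. A minimal sketch with a placeholder mean-difference statistic; the actual kernel-based statistic and bandwidth selection are the substance of the work and are not reproduced here.

```python
import random

def permutation_test(x, y, stat, n_perm=999, seed=0):
    """Two-sample permutation test: p-value of stat(x, y)
    against its null distribution under random relabelings."""
    rng = random.Random(seed)
    observed = stat(x, y)
    pooled = x + y
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # random relabeling of the pooled sample
        if stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)             # add-one correction keeps p > 0

# Placeholder statistic: absolute difference of sample means.
mean_diff = lambda a, b: abs(sum(a) / len(a) - sum(b) / len(b))
x = [0.1, 0.2, 0.15, 0.25, 0.3]
y = [1.1, 1.2, 1.05, 1.3, 1.15]
p = permutation_test(x, y, mean_diff)
print(p)  # small: the two samples clearly differ
```

In the setting of the talk, x and y would be collections of persistence diagrams and the statistic a kernel-smoothed distance between estimated intensity functions.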