Publications
We introduce a bias-corrected "two-scale DNN" estimator that linearly combines two distributional nearest neighbors (DNN) estimators built at different subsampling scales. The negative weights that arise from this combination are what let the estimator attain the optimal nonparametric convergence rate under fourth-order smoothness, which standard nearest-neighbor methods cannot achieve. We provide asymptotic normality results and practical inference via jackknife and bootstrap.
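The two-scale combination can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the DNN estimator is the 1-nearest-neighbor prediction averaged over all subsamples of size s, and it assumes the leading bias at scale s is proportional to s^(-2/d) in d dimensions. The function names (`dnn_weights`, `two_scale_weights`, `two_scale_dnn`) are hypothetical.

```python
import math
import numpy as np

def dnn_weights(n, s):
    # Weight on the i-th nearest neighbor (i = 1..n) when the 1-NN
    # prediction is averaged over all subsamples of size s.
    total = math.comb(n, s)
    return np.array([math.comb(n - i, s - 1) / total for i in range(1, n + 1)])

def dnn_estimate(X, y, x0, s):
    # Sort responses by Euclidean distance to the query point x0.
    order = np.argsort(np.linalg.norm(X - x0, axis=1))
    return float(dnn_weights(len(y), s) @ y[order])

def two_scale_weights(s1, s2, d):
    # ASSUMPTION: leading bias at scale s is proportional to s**(-2/d).
    # Solve w1 + w2 = 1 and w1*r1 + w2*r2 = 0 so that this term cancels.
    r1, r2 = s1 ** (-2.0 / d), s2 ** (-2.0 / d)
    return r2 / (r2 - r1), -r1 / (r2 - r1)

def two_scale_dnn(X, y, x0, s1, s2):
    w1, w2 = two_scale_weights(s1, s2, X.shape[1])
    return w1 * dnn_estimate(X, y, x0, s1) + w2 * dnn_estimate(X, y, x0, s2)
```

With s1 < s2, the weight on the smaller scale is negative: the combination extrapolates away from the more biased estimate, which is where the negative weights come from.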
This paper proposes a Deep Difference-in-Differences (Deep-DiD) method that combines deep learning with the classic DiD framework to estimate heterogeneous treatment effects. Applied to content creator selection on digital platforms, the method identifies which creators generate the most engagement uplift, enabling data-driven influencer marketing decisions.
Working Papers
We decompose price elasticity estimation into intermediate prediction tasks solved with bagged nearest neighbors and nonparametric control functions. The result is a point-wise elasticity estimator that is both consistent under endogeneity and computationally scalable to large datasets via just-in-time compilation. We apply it to simulate equilibrium prices under a counterfactual federal cigarette tax increase.
Consumer response to marketing text depends on how easily messages can be understood. However, existing measures of reading difficulty still rely on surface features, such as average word and sentence length. We introduce word-by-word predictability – estimated with large language models – as a scalable, theory-grounded measure of processing ease.
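One common way to quantify word-by-word predictability is surprisal, the negative log probability of each word given its context. The sketch below illustrates the idea only: a toy add-one-smoothed bigram model stands in for a large language model, and all names (`train_bigram`, `word_surprisals`, `mean_surprisal`) are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

def train_bigram(corpus):
    # Toy stand-in for a language model: bigram counts with add-one smoothing.
    tokens = corpus.lower().split()
    vocab_size = len(set(tokens)) + 1  # +1 reserves mass for unseen words
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))

    def prob(prev, word):
        return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)

    return prob

def word_surprisals(text, prob):
    # Surprisal in bits of each word given the preceding word.
    tokens = text.lower().split()
    return [-math.log2(prob(p, w)) for p, w in zip(tokens[:-1], tokens[1:])]

def mean_surprisal(text, prob):
    values = word_surprisals(text, prob)
    return sum(values) / len(values)
```

Lower mean surprisal indicates text that the model finds easier to predict. Unlike word or sentence length, the measure depends on context, so the same words in a scrambled order score as harder to process.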
We embed the full network graph directly into treatment effect estimation using a dual architecture: one neural network models heterogeneous effects as a function of covariates, while a graph convolutional network captures spillovers without imposing assumptions on how interference propagates. This lets the method handle complex interference patterns across diverse network topologies.
We develop a control function approach for estimating heterogeneous treatment effects over a continuous endogenous treatment, not just binary treatments or covariate-based heterogeneity. Applied to bullet chats (danmaku) on a video platform using a randomized "pre-set bullet chat" instrument, we find a non-monotonic effect: sparse chats reduce viewing, but denser chats increase it, driven by perceived popularity.
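The generic control function idea can be sketched on simulated data. This is the textbook two-step version, not the paper's estimator: the data-generating process, the quadratic outcome model, and the function names (`ols`, `control_function_demo`) are all assumptions made for illustration. The first-stage residual is added as a regressor in the outcome equation to absorb the endogenous part of the treatment.

```python
import numpy as np

def ols(X, y):
    # Least-squares coefficients.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def control_function_demo(n=20000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # randomized instrument
    u = rng.normal(size=n)                 # unobserved confounder
    t = z + u + 0.5 * rng.normal(size=n)   # continuous endogenous treatment
    # Assumed non-monotonic outcome: true coefficients are -1.0 on t, 0.5 on t^2.
    y = -1.0 * t + 0.5 * t**2 + 2.0 * u + rng.normal(size=n)

    ones = np.ones(n)
    first_stage = np.column_stack([ones, z])
    v = t - first_stage @ ols(first_stage, t)  # first-stage residual

    naive = ols(np.column_stack([ones, t, t**2]), y)
    cf = ols(np.column_stack([ones, t, t**2, v]), y)
    return naive, cf
```

In this simulation, the naive regression's linear coefficient is pulled away from the true value by the confounder, while including the residual `v` recovers coefficients close to the assumed ones.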