### 2.1 Localization precision

For a quantitative illustration, assume an ideal zero-center donut excitation and a background-free condition with only Poissonian noise. When localization estimates are represented by confidence intervals (Fig. 1a), the confidence interval for two-photon fluorescence localization is half as wide as that for single-photon fluorescence, demonstrating improved localization precision.

To evaluate the improvement explicitly, the Cramér-Rao bound (CRB) is calculated as the best attainable localization precision of 2p-MINFLUX [9]. The Poissonian mean \(\lambda\) is modified for two-photon fluorescence to account for the nonlinear effect:

$$\begin{aligned}&\lambda _{1p}\left( \vec {r}_{f}\right) =f_{1} I_{1p}\left( \vec {r}_{f}\right) \end{aligned}$$

(1a)

$$\begin{aligned}&\lambda _{2p}\left( \vec {r}_{f}\right) =f_{2} I_{2p}^{2}\left( \vec {r}_{f}\right) \end{aligned}$$

(1b)

where \(\vec {r}_{f}\) is the fluorophore position, \(f_{1}\) and \(f_{2}\) are, for simplicity, lumped factors covering the absorption cross-section of the fluorophore, the quantum yield and the collection efficiency of the system, and \(I_\mathrm{{1p}}\) and \(I_\mathrm{{2p}}\) are the point spread functions (PSFs) of the donut excitations. All other parameters in the model are kept unchanged.

In general, the CRB is an intricate expression in \(\frac{\partial {\lambda }}{\partial x}\) and \(\frac{\partial {\lambda }}{\partial y}\). As we show in Code File 1 (Ref. [25]), for the typical four-point targeted coordinate pattern (TCP) [9,10,11,12,13], the CRB at the origin, where localization precision is highest (*i.e.*, the CRB is minimal), can be expressed explicitly as [9]:

$$\begin{aligned}&CRB_{1p}(\overrightarrow{0}) =\frac{L}{2\sqrt{2N}} \frac{s}{1-\frac{L^{2} \ln 2}{fwhm_{1p}^{2}}} \end{aligned}$$

(2a)

$$\begin{aligned}&CRB_{2p}(\overrightarrow{0}) =\frac{L}{4\sqrt{2N}} \frac{s}{1-\frac{L^{2} \ln 2}{fwhm_{2p}^{2}}} \end{aligned}$$

(2b)

where *L* is the diameter of the TCP circle, *N* is the number of detected photons, *fwhm* is the full width at half maximum of the excitation PSF, and the factor \(s=\sqrt{\left( \frac{1}{SBR}+1\right) \left( \frac{3}{4SBR}+1\right) }\), where *SBR(L)* is the (median) signal-to-background ratio (dependent on *L*) [9, 12]. Since \({L} \ll {fwhm_{1p}} \le {fwhm_{2p}}\), the precision improves by slightly more than two-fold:

$$\begin{aligned} CRB_{2p}(\overrightarrow{0})\le \frac{1}{2} CRB_{1p}(\overrightarrow{0}) \end{aligned}$$

(3)

The CRB is halved because a factor of 2 appears when the square of \(I_{2p}\) is differentiated:

$$\begin{aligned}&\frac{\partial \lambda _{1p}}{\partial x}=\frac{\partial \left( f_{1} I_{1p}\right) }{\partial x}=f_{1} \frac{\partial I_{1p}}{\partial x} \end{aligned}$$

(4a)

$$\begin{aligned}&\frac{\partial \lambda _{2p}}{\partial x}=\frac{\partial \left( f_{2} I_{2p}^2\right) }{\partial x}=2f_{2}I_{2p} \frac{\partial I_{2p}}{\partial x} \end{aligned}$$

(4b)

and because, in the CRB formula (see Eq. (S26) in [9]), the denominator contains one more power of these partial derivatives than the numerator, which leaves an additional factor of 2 in the denominator.
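The halving can be sketched in a few lines under simplifying assumptions that are introduced here for illustration and are not part of the derivation in [9]: no background, identical PSF shape for both excitation modes (\(I_{2p}=I_{1p}=I\)), and only the \(K\) exposures displaced from the origin, which by symmetry share the same intensity \(I_{L}\) at the origin (the central exposure, whose intensity gradient vanishes there, is left out). With multinomial probabilities \(p_{i}\) and Fisher information \(F_{xx}\):

$$\begin{aligned} p_{i}=\frac{\lambda _{i}}{\sum _{j} \lambda _{j}}, \qquad F_{xx}=N \sum _{i} \frac{1}{p_{i}}\left( \frac{\partial p_{i}}{\partial x}\right) ^{2} \end{aligned}$$

$$\begin{aligned} \left. \frac{\partial p_{i}^{2p}}{\partial x}\right| _{\vec {0}}=\frac{2 I_{L} \frac{\partial I_{i}}{\partial x} \cdot K I_{L}^{2}-I_{L}^{2} \cdot 2 I_{L} \sum _{j} \frac{\partial I_{j}}{\partial x}}{K^{2} I_{L}^{4}}=2\, \frac{K \frac{\partial I_{i}}{\partial x}-\sum _{j} \frac{\partial I_{j}}{\partial x}}{K^{2} I_{L}}=2 \left. \frac{\partial p_{i}^{1p}}{\partial x}\right| _{\vec {0}} \end{aligned}$$

$$\begin{aligned} F^{2p}(\vec {0})=4 F^{1p}(\vec {0}) \quad \Rightarrow \quad CRB_{2p}(\vec {0})=\frac{1}{2} CRB_{1p}(\vec {0}) \end{aligned}$$

Since \(p_{i}=1/K\) at the origin in both cases, doubling every derivative quadruples each entry of the Fisher information matrix, and the CRB, which scales as its inverse square root, is halved. The additional gain beyond two-fold then comes only from the different *fwhm* in Eq. (2).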

The maximum localization precision of single-photon and two-photon MINFLUX is compared as a function of *N* (Fig. 1b). For *L* = 50 nm, *SBR* = 3, *N* = 100, and single-photon and two-photon excitation wavelengths of 647 nm and 800 nm, \(CRB_{2p}\) and \(CRB_{1p}\) are 1.15 nm and 2.31 nm, respectively, giving a CRB enhancement ratio \(R_{CRB}\) = \({CRB_{1p}}\)/\({CRB_{2p}}\) = 2.01. In addition, 2p-MINFLUX attains the same or slightly higher localization precision with only 1/4 of the photons required by 1p-MINFLUX (\(N_{2p}\) = 100 versus \(N_{1p}\) = 400, or \(N_{2p}\) = 400 versus \(N_{1p}\) = 1600).
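As a rough numerical check of Eq. (2), the following Python sketch evaluates both bounds at the origin. The donut widths are not stated in this section, so `FWHM_1P_NM = 360.0` and the wavelength-scaled `FWHM_2P_NM` are assumptions chosen for illustration; the function names are hypothetical and this is not the code of Code File 1.

```python
import math


def s_factor(sbr):
    """Background factor s defined with Eq. (2)."""
    return math.sqrt((1.0 / sbr + 1.0) * (3.0 / (4.0 * sbr) + 1.0))


def crb_origin(L, N, sbr, fwhm, order):
    """CRB at the TCP origin: Eq. (2a) for order=1, Eq. (2b) for order=2.

    L and fwhm share one length unit (nm here); the result is in that unit.
    """
    prefactor = L / (2.0 * order * math.sqrt(2.0 * N))
    psf_term = 1.0 - L**2 * math.log(2.0) / fwhm**2
    return prefactor * s_factor(sbr) / psf_term


# ASSUMPTION (not from the text): donut FWHM values, with the two-photon
# width obtained by scaling the one-photon width with wavelength.
FWHM_1P_NM = 360.0
FWHM_2P_NM = FWHM_1P_NM * 800.0 / 647.0

crb_1p = crb_origin(L=50.0, N=100, sbr=3.0, fwhm=FWHM_1P_NM, order=1)
crb_2p = crb_origin(L=50.0, N=100, sbr=3.0, fwhm=FWHM_2P_NM, order=2)
print(f"CRB_1p = {crb_1p:.2f} nm, CRB_2p = {crb_2p:.2f} nm, "
      f"R_CRB = {crb_1p / crb_2p:.2f}")

# Photon saving: 2p with N photons versus 1p with 4N photons.
for n_2p in (100, 400):
    c2 = crb_origin(50.0, n_2p, 3.0, FWHM_2P_NM, order=2)
    c1 = crb_origin(50.0, 4 * n_2p, 3.0, FWHM_1P_NM, order=1)
    print(f"N_2p = {n_2p}: {c2:.2f} nm; N_1p = {4 * n_2p}: {c1:.2f} nm")
```

With these assumed widths the printed bounds should land close to the values quoted above; other plausible widths shift them only slightly, because \(L \ll fwhm\).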

The dependence of the CRB on possible parameter changes is also considered (Fig. 2). Contrary to intuition, the CRB decreases (*i.e.*, localization precision increases) as the excitation wavelength increases (Fig. 2a), unlike in traditional localization methods, where the localization uncertainty is proportional to the excitation wavelength. The CRB changes only slightly with excitation wavelength, provided that *SBR* and *L* are the same. We set the single-photon excitation wavelength to 647 nm, since red/crimson dyes were most commonly used in previous works [9,10,11,12,13], and the two-photon excitation wavelength to 800 nm, which corresponds to these dyes [23, 24]. Longer wavelengths with less phototoxicity, such as 1280 nm [26], may also be considered; note that with 1280 nm excitation, *SBR* may be compromised due to increased background from an enlarged PSF and decreased signal from extended intensity minima.

With \(L_0\) = 50 nm fixed, a higher *SBR* results in higher localization precision (*i.e.*, lower CRB) (Fig. 2b). If *SBR* drops to \(\le\) 1, precision degrades sharply, because the factor *s*, and hence the CRB, grows roughly in inverse proportion to *SBR* when *SBR* is small. Circumstances with *SBR* \(\le\) 1 should therefore be avoided as much as possible (*SBR* \(\le\) 1 is not a good operating point for any imaging technique).
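To make the effect of the factor *s* concrete, the short Python sketch below tabulates it for a few *SBR* values and reports the CRB relative to the *SBR* = 3 case at fixed *L* and *N*. The chosen *SBR* values are arbitrary illustration points, not data from Fig. 2b.

```python
import math


def s_factor(sbr):
    """Factor s = sqrt((1/SBR + 1) * (3/(4 SBR) + 1)) from Eq. (2)."""
    return math.sqrt((1.0 / sbr + 1.0) * (3.0 / (4.0 * sbr) + 1.0))


reference = s_factor(3.0)  # SBR = 3 is the value used for the comparison
for sbr in (0.25, 0.5, 1.0, 2.0, 3.0, 10.0):
    s = s_factor(sbr)
    print(f"SBR = {sbr:5.2f}: s = {s:.2f}, "
          f"CRB relative to SBR = 3: {s / reference:.2f}")
```

Because *L* and *N* cancel in the ratio, the last column applies equally to Eq. (2a) and Eq. (2b).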

We then consider the combined influence of *SBR* and *L* on the CRB (Fig. 2c, d). *SBR* decreases as *L* decreases (Fig. 2d). Given a fixed *SBR* at \(L_0\) = 50 nm, there is a lower bound of the CRB and a corresponding optimal *L* (denoted as \(L_{opt}\)) (Fig. 2c). We argue that, given the limited *SBR* that results from MINFLUX using the donut minimum, decreasing *L* below 50 nm towards \(L_{opt}\) is not as fruitful as it first seems. Note that *SBR* is not constant; the median *SBR* is used to characterize its distribution and ranges from about 1.4 to 4.2 for biological samples at *L* = 50 nm [12]. For 1p-MINFLUX, the median *SBR* \(\approx\) 0.81 at \(L_{opt}\) (Fig. 2d), which means that *SBR* \(\le\) 0.81 in 50 percent of localizations, so the attainable CRB is impaired. For 2p-MINFLUX, *SBR* decreases more rapidly with *L* than for 1p-MINFLUX (Fig. 2d), limiting a further decrease of *L* (Fig. 2c). For a direct comparison, the same *L* = 50 nm and median *SBR* = 3 are used for both 1p-MINFLUX and 2p-MINFLUX.

The CRB across the 2D xy-plane is compared for one-photon and two-photon MINFLUX at *L* = 50 nm (Fig. 3). Only the CRB in the center region of the TCP circle is of real interest, rather than the whole xy-plane, since a previous round of iterative 1p-MINFLUX with *L* = 100 nm already localizes the fluorophore with single-digit nanometer precision (*e.g.*, 3.3 nm) [12]. Within a center region of radius 3.3 nm, \(R_{CRB}\) \(\ge\) 1.92 throughout. (Note that in iterative 2p-MINFLUX, the second-to-last round with *L* = 100 nm would likely also reach a precision better than 3.3 nm, further improving \(R_{CRB}\).) The CRB increases (precision decreases) with increasing *r*, and it increases faster for 2p-MINFLUX than for 1p-MINFLUX (Additional file 1: Figure S1a, b), so \(R_{CRB}\) decreases with increasing *r* (Fig. 3c, d). This faster increase can be explained by the faster decrease of the 'signal' intensity in 2p-MINFLUX compared to 1p-MINFLUX (Additional file 1: Supplementary note 1). At radii of 6.7 nm and 10.0 nm (representing 2\(\sigma\) and 3\(\sigma\)), the average \(R_{CRB}\) is 1.67 and 1.31, respectively. In addition, the 2p-MINFLUX CRB (Fig. 3a, S1a) is much more anisotropic than the 1p-MINFLUX CRB (Fig. 3b, S1b); a triangle-like contour can be seen in the CRB of 2p-MINFLUX (Fig. 3a), but not in that of 1p-MINFLUX (Fig. 3b). This can also be explained qualitatively by plotting the different 'signals': in 2p-MINFLUX, the 'signals' decrease faster with *r* for angle 0 than for angle 60 (Additional file 1: Supplementary note 1). Anisotropy also exists for the CRB at *L* = 100 nm (Additional file 1: Supplementary note 2, Figure S2).

The enhancement of z-localization precision is similar to that of xy-localization. The 3D donut is modeled simply as a quadratic function [12]. The highest precision, at the origin, also improves two-fold:

$$\begin{aligned}&CRBz_{1p}(\overrightarrow{0})=\frac{L}{4\sqrt{N}}\left( \frac{1}{SBR}+1\right) \end{aligned}$$

(5a)

$$\begin{aligned}&CRBz_{2p}(\overrightarrow{0})=\frac{L}{8\sqrt{N}}\left( \frac{1}{SBR}+1\right) \end{aligned}$$

(5b)

### 2.2 Localization reconstruction

Using maximum likelihood estimation (MLE), z-axis localization, as a 1D problem, can be solved analytically for 2p-MINFLUX as well:

$$\begin{aligned} \hat{z}_{2p}^{(MLE)}=-\frac{L}{2}+\frac{L}{1+\sqrt[4]{\frac{n_{1}}{n_{0}}}} \end{aligned}$$

(6)

where \({n}_{0}\) and \({n}_{1}\) are the numbers of photons detected with \(I\left( -\frac{L}{2}\right)\) and \(I\left( \frac{L}{2}\right)\), respectively. Imaginary roots and the real root outside \(\left( -\frac{L}{2}, \frac{L}{2}\right)\) are discarded. In simulation, the above estimator agrees with the ground truth, and improved localization precision can be seen (Fig. 4).
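The closed-form estimator of Eq. (6) is simple enough to try out directly. The Python sketch below is a minimal illustration, not the simulation behind Fig. 4: it assumes a background-free quadratic donut, draws binomial photon counts with the standard library, and uses hypothetical function names. The `order=1` branch (square root instead of fourth root) is the one-photon analogue implied by the same quadratic model and is an assumption here.

```python
import math
import random


def z_mle(n0, n1, L, order=2):
    """Closed-form z estimate from the photon counts of the two exposures.

    order=2 is Eq. (6) (fourth root). order=1 uses a square root instead,
    the one-photon analogue ASSUMED here from the same quadratic model.
    """
    if n0 == 0:
        # No photons with the zero at -L/2: the estimate is that zero itself.
        return -L / 2.0
    root = (n1 / n0) ** (1.0 / (2 * order))
    return -L / 2.0 + L / (1.0 + root)


def simulate(z_true, L, N, order, trials=500, seed=0):
    """Mean and standard deviation of the estimate over simulated exposures.

    ASSUMPTION: background-free quadratic donut, so the emission rate with
    the zero at z0 scales as (z_true - z0) ** (2 * order).
    """
    rng = random.Random(seed)
    lam0 = (z_true + L / 2.0) ** (2 * order)  # zero placed at -L/2
    lam1 = (z_true - L / 2.0) ** (2 * order)  # zero placed at +L/2
    p0 = lam0 / (lam0 + lam1)
    estimates = []
    for _ in range(trials):
        n0 = sum(rng.random() < p0 for _ in range(N))  # binomial draw
        estimates.append(z_mle(n0, N - n0, L, order))
    mean = sum(estimates) / trials
    std = math.sqrt(sum((e - mean) ** 2 for e in estimates) / trials)
    return mean, std


for order, label in ((1, "1p"), (2, "2p")):
    mean, std = simulate(z_true=5.0, L=50.0, N=100, order=order)
    print(f"{label}: mean = {mean:.2f} nm, std = {std:.2f} nm")
```

In this idealized setting the spread of the two-photon estimates should come out at roughly half that of the one-photon estimates, in line with Eq. (5) for large *SBR*.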

For localization estimation in the xy-plane, we investigated two methods: maximum likelihood estimation (MLE) and least mean square (LMS) estimation, which are unbiased and biased, respectively. For LMS estimation, the same first-order linearization as in [9] is used. The LMS estimator can be solved analytically:

$$\begin{aligned}&\hat{\vec {r}}_{1p}^ {(LMS)}=-\frac{1}{1-\frac{L^{2} \ln 2}{fwhm_{1p}^{2}}} \sum _{i=1}^{k} \hat{p}_{i} \vec {r}_{i} \end{aligned}$$

(7a)

$$\begin{aligned}&\hat{\vec {r}}_{2p}^ {(LMS)}=-\frac{1}{2}\frac{1}{1-\frac{L^{2} \ln 2}{fwhm_{2p}^{2}}} \sum _{i=1}^{k} \hat{p}_{i} \vec {r}_{i} \end{aligned}$$

(7b)

where *k* is the number of exposures and \(\hat{{p}}_{i}= {n}_{i}/{N}\), with \({n}_{i}\) and \(\vec {{r}}_{i}\) being, respectively, the number of collected photons and the displacement of the excitation beam for each exposure. As in the expression for the CRB, the denominator of the two-photon LMS estimator is multiplied by an additional factor of 2.
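Eq. (7) translates into a few lines of Python. The sketch below is illustrative only: the beam layout in `tcp_positions` (one exposure at the origin plus three at 0, 120 and 240 degrees on a circle of diameter *L*) and the 360 nm width in the example call are assumptions, and the function names are hypothetical.

```python
import math


def tcp_positions(L):
    """ASSUMED four-point TCP: origin plus three points on a circle of diameter L."""
    r = L / 2.0
    angles = (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)
    return [(0.0, 0.0)] + [(r * math.cos(a), r * math.sin(a)) for a in angles]


def lms_estimate(counts, positions, L, fwhm, order=1):
    """LMS position estimate: Eq. (7a) for order=1, Eq. (7b) for order=2."""
    N = sum(counts)
    # Photon-weighted mean beam displacement, sum_i p_i * r_i with p_i = n_i / N.
    mean_x = sum(n * x for n, (x, _) in zip(counts, positions)) / N
    mean_y = sum(n * y for n, (_, y) in zip(counts, positions)) / N
    scale = -1.0 / (order * (1.0 - L**2 * math.log(2.0) / fwhm**2))
    return scale * mean_x, scale * mean_y


L_NM = 50.0
beams = tcp_positions(L_NM)
counts = [0, 10, 40, 40]  # few photons for the beam at (L/2, 0)
for order, label in ((1, "1p"), (2, "2p")):
    x, y = lms_estimate(counts, beams, L_NM, fwhm=360.0, order=order)
    print(f"{label}: x = {x:.2f} nm, y = {y:.2f} nm")
```

The negative sign in the estimator reflects that the exposure with the fewest photons is the one whose donut zero lies closest to the fluorophore, so the estimate points towards that beam position.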

For MLE, the log-likelihood function is maximized in the classical way, and the maximization is solved numerically. The numerical solver needs a starting value to converge; in our simulation, the LMS estimator served as this starting value thanks to its simple form. We simulated MINFLUX imaging in the xy-plane for *L* = 50 nm with the numerically solved MLE (Fig. 5). With a photon number of *N* = 250, 2p-MINFLUX can already achieve 6 nm resolution, which is barely feasible in 1p-MINFLUX (Fig. 5b, c). In addition, 2p-MINFLUX with *N* = 250 achieved localization distributions similar to those of 1p-MINFLUX with *N* = 1000 (Fig. 5d), confirming the capability of 2p-MINFLUX to reduce the number of photons required.
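A numerical MLE of this kind can be sketched with the standard library alone. The example below is not the solver used for Fig. 5: it swaps in a plain coarse-to-fine grid search for the numerical optimizer, assumes a background-free donut with a quadratic zero and Gaussian envelope, assumes a TCP with beams at 0, 120 and 240 degrees, and uses hypothetical names throughout. The `start` argument is where an LMS estimate could be passed.

```python
import math


def donut(dx, dy, fwhm):
    """ASSUMED donut PSF (arbitrary units): quadratic zero, Gaussian envelope."""
    r2 = dx * dx + dy * dy
    return r2 * math.exp(-4.0 * math.log(2.0) * r2 / fwhm**2)


def probabilities(x, y, beams, fwhm, order):
    """Multinomial probabilities p_i; order=1 for 1p, order=2 for 2p (Eq. 1)."""
    lam = [donut(x - bx, y - by, fwhm) ** order for bx, by in beams]
    total = sum(lam)
    return [v / total for v in lam]


def log_likelihood(x, y, counts, beams, fwhm, order):
    p = probabilities(x, y, beams, fwhm, order)
    # Clip p_i so that a beam zero exactly on a grid point does not give log(0).
    return sum(n * math.log(max(pi, 1e-300)) for n, pi in zip(counts, p) if n > 0)


def mle_grid(counts, beams, L, fwhm, order, start=(0.0, 0.0), levels=8, half_points=10):
    """Coarse-to-fine grid search for the log-likelihood maximum."""
    cx, cy = start
    half = L / 2.0
    for _ in range(levels):
        step = half / half_points
        best = None
        for i in range(-half_points, half_points + 1):
            for j in range(-half_points, half_points + 1):
                x, y = cx + i * step, cy + j * step
                ll = log_likelihood(x, y, counts, beams, fwhm, order)
                if best is None or ll > best[0]:
                    best = (ll, x, y)
        _, cx, cy = best
        half = 4.0 * step  # shrink the window around the current best point
    return cx, cy


L_NM, FWHM_NM = 50.0, 360.0  # FWHM_NM is an assumed value
r = L_NM / 2.0
beams = [(0.0, 0.0)] + [(r * math.cos(a), r * math.sin(a))
                        for a in (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)]
# Noise-free expected counts for an emitter at (3, -2) nm, two-photon case.
expected = [1000.0 * p for p in probabilities(3.0, -2.0, beams, FWHM_NM, order=2)]
print(mle_grid(expected, beams, L_NM, FWHM_NM, order=2))
```

With noise-free expected counts the search should return a point very close to the simulated emitter position; with Poissonian counts, repeating the estimate many times gives a localization distribution of the kind discussed above.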

We believe 2p-MINFLUX would be capable of multicolor localization (Fig. 5e, f). Because the two-photon absorption peaks of different fluorophores overlap spectrally, it is possible to excite multiple fluorophores simultaneously. If the emission spectra of these fluorophores do not overlap, the different fluorescence signals can be separated completely with simple dichromatic filters. Although a single excitation wavelength is not mandatory for multicolor two-photon microscopy, single-wavelength two-photon excitation is beneficial for multicolor MINFLUX, as it can be easily achieved with dichromatic beam splitters. In this configuration, multicolor 2p-MINFLUX would not require registration of the different color channels, as they are excited with the same donut coordinates. Hence, this may enable simultaneous registration-free multicolor MINFLUX tracking, which could be crucial for the study of molecular interactions. Note that *L* could be adjusted dynamically, and that, since the CRB worsens with increasing *r* (Fig. 3), simultaneous localization benefits most only when the fluorophores are close enough to the coordinate origin of the excitation pattern.