Evolution of seeing in SDSS imaging runs
Table of Contents
1 Introduction
To simulate the effectiveness of seeing-based survey strategy decisions in selecting observing pointing by DES, we need to be able to simulate the evolution of seeing over the course of a night.
To get a general idea of how seeing evolves over time, I examined imaging runs from SDSS. The degree to which the data are applicable is limited by the difference in site: Apache Point Observatory has substatially different weather patterns than CTIO.
2 Theory
Tokovinin 2002 provides a good summary of the seenig theory, as does the introduction to Floyd, Thomas-Osip & Prieto 2010, which also provides a good list of references on the seeing theory. Some other useful references are:
-
the Kolmogorov turbulence model
- Kolmogorov 1941
- Tatarskii 1969
- the von Karman turbulence model
- derivation of the FWHM from the von Karaman model
3 Statistics on PSF sizes in stripe 82 runs
3.1 Selecting data
Because I am primarily interested in the evolution of seeing with time, and do not want differences in seeing caused by changis pointing and airmass to confuse the issue, I will select a set of images on stripe 82, for which the SDSS telescope is kept at constant pointing throughout each run.
I choose data from only one CCD, r3, because atmospheric seeing should be similar in all CCDs.
3.2 The histogram of seeing for SDSS strip 82 runs
Take a quick look at the distribution of seeing:
This looks like in may be log-normal, as suggested by theory, so try plotting the log:
load('/Users/neilsen/Data/SDSS/seeing-3.Rdata') attach(seeingFrame) seeingHist <- "seeingLogHist-3.png" png(seeingHist) hist(logWidth_r,50,xlab="log(FWHM (arcseconds))",ylab="Frames",main="") dev.off()
3.3 Fitting a log-normal distribution
Even the histogram of the log appears obviously non-gaussian. We can try some statistical tests for normality to quantify this:
require(nortest) require(e1071) load('/Users/neilsen/Data/SDSS/seeing-3.Rdata') attach(seeingFrame) ad.test(logWidth_r) cvm.test(logWidth_r) lillie.test(logWidth_r) pearson.test(logWidth_r)
The following object(s) are masked _by_ '.GlobalEnv':
field, psfWidth_r, run
Anderson-Darling normality test
data: logWidth_r
A = 925.4465, p-value = Inf
Cramer-von Mises normality test
data: logWidth_r
W = 142.4181, p-value = Inf
Lilliefors (Kolmogorov-Smirnov) normality test
data: logWidth_r
D = 0.0625, p-value < 2.2e-16
Pearson chi-square normality test
data: logWidth_r
P = 16320, p-value < 2.2e-16
We can also look specifically at the skewness:
require(nortest) require(e1071) load('/Users/neilsen/Data/SDSS/seeing-3.Rdata') attach(seeingFrame) skewness(logWidth_r)
The following object(s) are masked _by_ '.GlobalEnv':
field, psfWidth_r, run
[1] 0.6216421
and kurtosis:
require(nortest) require(e1071) load('/Users/neilsen/Data/SDSS/seeing-3.Rdata') attach(seeingFrame) kurtosis(logWidth_r)
The following object(s) are masked _by_ '.GlobalEnv':
field, psfWidth_r, run
[1] 0.01770088
Indeed, this sample does not follow a normal distribution. In particular, it appears highly skew.
3.4 Check whether the data for all runs is log-skew-normal
Because the devaition from normality is primarily because the distribution is too skew, we can try fitting a sky-normal distribution.
require(sn) load('/Users/neilsen/Data/SDSS/seeing-3.Rdata') attach(seeingFrame) fit_result <- sn.mle(y=logWidth_r,plot.it=FALSE) dp0 <- cp.to.dp(fit_result$cp) print(dp0)
3.5 Check whether the data for all runs is log bimodal
4 Selecting sample SDSS imaging runs
I continue to focus my attention on stripe 82 runs to avoid differencse in PSF arrising from changing telescope position.
Using Tamas Bodavari's tool, sqlcl, for querying the SDSS CAS, I
select the 8 longest runs on stripe 82:
CASNAME=http://cas.sdss.org/stripe82/en/tools/search/x_sql.asp sqlcl -s $CASNAME -q \ "SELECT TOP 8 run, count(*) AS nFields, MAX(airmass_r)-MIN(airmass_r) AS 'airmass range' FROM field WHERE camcol=3 GROUP BY run ORDER BY nFields DESC"
| run | nFields | airmass range |
| 4874 | 990 | 2.75945759E-4 |
| 4849 | 931 | 2.48779133E-4 |
| 5610 | 867 | 3.03274553E-4 |
| 6955 | 851 | 3.60733085E-4 |
| 6580 | 849 | 1.026E-3 |
| 7155 | 846 | 7.10125103E-4 |
| 7161 | 845 | 7.69680979E-4 |
| 6433 | 842 | 3.19541957E-4 |
5 Examining run 4874
5.1 Obtaining the data
Run 4874 is the longest run, so I will start by examining that one. I begin by querying the CAS for the seeing data and storing it in a temporary file:
CASNAME=http://cas.sdss.org/stripe82/en/tools/search/x_sql.asp sqlcl -s $CASNAME -q \ "SELECT field, psfWidth_u, psfWidth_g, psfWidth_r, psfWidth_i, psfWidth_z FROM field WHERE camcol=3 AND run=4874 ORDER BY field " > /tmp/seeing-4874-3.csv
5.2 Plotting the seeing vs. frame number for run 4874
Now, I use R to plot the seeing against the frame number
seeingFrame <- read.csv('/tmp/seeing-4874-3.csv',header=TRUE) save(seeingFrame,file='/tmp/seeing-4874-3.Rdata') attach(seeingFrame) seeingPlot <- "seeing-4874-3.png" png(seeingPlot) par(oma=c(0,0,0,0),mar=c(5,4,1,0)) plot(field,psfWidth_u,type="l",col="violet",ylab="Width (arcseconds)",ylim=c(0.8,2.4)) lines(field,psfWidth_g,col="blue") lines(field,psfWidth_r,col="green") lines(field,psfWidth_i,col="orange") lines(field,psfWidth_z,col="red") dev.off()
5.3 The partial autocorrelation function of run 4874
In the hope that I might be able to use an autoregressive model for the simulated data, I examine the partial autocorrelation function. Using the tools described in Chapter 9 of Data Analysis and Graphics Using R by Maindonald & Braun:
load('/tmp/seeing-4874-3.Rdata') attach(seeingFrame) seeingPartAutocor <- "seeingPartAutocor-4874-3.png" png(seeingPartAutocor) par(oma=c(0,0,0,0),mar=c(5,4,1,0)) acf(log_fwhms,lag.max=200,type="partial") dev.off()
Indeed, the low amplitude of most terms is promising for the use of an autoregressive model.
5.4 Autoregressive model fit on run 4874
In an autoregressive model, the value of some term Xt depends on its predecessors: \[ X_{t} = \mu + \alpha_{1}(X_{t-1} - \mu) + ... + \alpha_{p}(X_{t-p} - \mu) + \epsilon_t \] where εt is the "innovation" at frame t.
Beginning with letting R automatically select the number of terms:
load('/tmp/seeing-4874-3.Rdata') mdl<-ar(seeingFrame$log_fwhms,method="mle") print(paste("mu: ",mdl$x.mean, "innovation variance:",mdl$var.pred)) print("alphas:") print(mdl$ar)
Only the first three terms seem really significant, so limit the model:
load('/tmp/seeing-4874-3.Rdata') mdl<-ar(seeingFrame$log_fwhms,method="mle",order.max=3) print(paste("mu: ",mdl$x.mean, "innovation variance:",mdl$var.pred)) print(paste("alphas:",mdl$ar[1],mdl$ar[2],mdl$ar[3]))
So, our model for run 4874 looks like this: \[ X_{t} = 0.04 \mu + 0.78 X_{t-1} + 0.18 X_{t-3} + 0.24 (X_{t-1}-X_{t-2}) + \epsilon_t \]
6 Look at more individual runs
6.1 Runs to consider
| run | nFields | airmass range |
|---|---|---|
| 4874 | 990 | 2.75945759E-4 |
| 4849 | 931 | 2.48779133E-4 |
| 5610 | 867 | 3.03274553E-4 |
| 6955 | 851 | 3.60733085E-4 |
| 6580 | 849 | 1.026E-3 |
| 7155 | 846 | 7.10125103E-4 |
| 7161 | 845 | 7.69680979E-4 |
| 6433 | 842 | 3.19541957E-4 |
| 7106 | 841 | 5.10767183E-4 |
| 6568 | 840 | 9.50800509E-4 |
| 5622 | 838 | 2.40134091E-4 |
| 6577 | 836 | 9.11053572E-4 |
| 7145 | 836 | 6.04777575E-4 |
| 6504 | 835 | 4.41759324E-4 |
| 6552 | 833 | 7.67624043E-4 |
6.2 Run 4874
6.3 Run 4849
6.4 Run 5610
6.5 Run 6955
6.6 Run 6580
6.7 Run 7155
6.8 Run 7161
6.9 Run 6433
6.10 Run 7106
6.11 Run 6568
Date: 2011-06-15 11:31:48 CDT
HTML generated by org-mode 7.4 in emacs 23