Statistics Department

Programs, Requirements, Courses

Programs and Requirements for the Ph.D.

A student applying to the Ph.D. program should normally have taken advanced calculus, linear algebra, probability and a few courses in statistics. Additional courses in mathematics, especially a course in real analysis, will be helpful. Some familiarity with computers and programming is expected. However, students who have not taken courses in all of these areas should not be discouraged from applying, especially if they have a substantial background, through study or experience, in some area of science or other discipline involving quantitative reasoning and empirical investigation. Because statistics is an empirical and interdisciplinary field, a strong background in some area of potential application of statistics is a considerable asset. Indeed, a student's background in mathematics and in science or another quantitative discipline is more important than his or her background in statistics in determining the ability of the student to do statistical research.

Reflecting the diversity of the students, the program is flexible in terms of the timing and content of coursework and research. The following describes a typical path for a student with a solid background in mathematics and some familiarity with statistics. During the first year, the student takes courses in probability theory and stochastic processes (Statistics 31200 and 31300), mathematical statistics (Statistics 30100, 30200 and 30400), and applied statistics (Statistics 34300, 34500 and 34700). These three areas receive roughly equal emphasis and serve as the foundation for all later work. A substantial component of the applied courses is the use of advanced statistical programming languages, such as S-PLUS, for data analysis. At the start of the second year, the student takes preliminary examinations covering all these areas. During the second year, students take more advanced and specialized courses, depending on their interests. The selection of courses offered varies from year to year, but there is always a variety of courses in probability and in theoretical and applied statistics sufficient to address quite diverse interests. By the end of the second year, students should have begun to work with a thesis advisor and to initiate their doctoral research. One common way to get started in research is to take a reading course with a prospective advisor. After making substantial research progress, the student prepares a paper that is distributed to the faculty and students and is discussed in an open departmental workshop. These workshops are held once in each of the third and fourth years. A completed dissertation is presented in a formal departmental seminar, and then a final oral examination completes the program for the Ph.D. In recent years, a large majority of our students have completed the Ph.D. within four or five years of entering the program. Students who have significant graduate training before entering the program can (and do) obtain their doctoral degree in three years.

Some students must postpone taking one of the usual first-year courses in order to strengthen their background in that area first. A typical variant of the program for such a student involves taking only some parts of the preliminary examinations at the start of the second year, and then taking the remaining parts later in that year. This delay does not usually slow the student's progress through the remainder of the program.

Most students receiving a doctorate proceed to faculty appointments in research universities. A substantial number take positions in government or industry, such as in research groups in government labs, in communications, in commercial pharmaceutical companies, and in banking and financial institutions. The department has an excellent track record in placing new Ph.D.s.

Program in Biostatistics

Doctoral students with an interest in applying statistical methods and doing research in biology and medicine can do so by tailoring their doctoral program to emphasize biostatistics. Courses are available on a regular basis in areas such as biometry, survival analysis, and clinical trials. The Department of Health Studies, which includes groups in biostatistics and in epidemiology, holds weekly seminars that attract graduate students, physicians, and medical researchers in a setting that encourages discussion. Students in Statistics with an interest in biostatistics can and do work with faculty in Health Studies on their dissertations.

Consulting and Computation

Students in the degree programs are encouraged to complement their training in statistics with experience and study in some field where statistics is important. Courses and study in empirical science and summer employment offer opportunities in this direction. The department operates a consultation program under the guidance of one or more of the faculty, of service mainly to students and faculty throughout the University. All degree candidates in statistics must participate in these consultative activities, at a level appropriate to their training and prior experience, as an integral part of their degree program. An informal seminar meets regularly over lunch to provide a forum for presenting and discussing problems, solutions and topics in statistical consultation. Students present interesting or difficult consulting problems to the seminar as a way of stimulating wider consideration of the problem, and of developing familiarity with the kinds of problems and lines of attack that are involved. Often the consultee will participate in the presentation and discussion.

Facilities

Almost all departmental activities (classes, seminars, computation, and student and faculty offices) are located in Eckhart Hall or neighboring Ryerson Hall. Each student is assigned a desk in one of several offices. A small departmental library and conference room is a common meeting place for formal and informal gatherings of students and faculty. The mathematics and statistics branch of the University Library is located on the second floor of Eckhart Hall. The major computing facilities of the department are based upon a network of PCs running mainly Linux. One computer room currently houses many of these PCs; this room is intended primarily for graduate students in the Statistics Department. In addition, all student offices have limited computer facilities.

Teaching

Part of every statistician's job is to evaluate the work of others and to communicate knowledge, experience, and insights. Every statistician is, to some extent, an educator, and we provide our graduate students with training and experience for this aspect of their professional lives. We expect all doctoral students, regardless of their professional objectives and sources of financial support, to take part in a graduated program of participation in some or all phases of instruction, from grading, course assisting, and conducting discussion sections to serving as a lecturer with responsibility for an entire course.

Statistics Throughout the University

In addition to the courses, seminars and programs in the Department of Statistics, courses and workshops of direct interest to statisticians occur throughout the University, most notably in the programs in statistics and econometrics in the Graduate School of Business and in the research programs in Health Studies, Human Genetics, Financial Mathematics, Computer Science, Economics and NORC (formerly the National Opinion Research Center).

Programs and Requirements for the M.S.

The background needed for M.S. students is essentially the same as for Ph.D. students except that courses in mathematics beyond advanced calculus and linear algebra are less crucial. The main requirements of the M.S. program are a sequence of nine approved courses plus a Master's paper. The nine required courses must include a three-quarter sequence in applied statistics and a master-level three-quarter sequence in mathematical statistics (Statistics 24400, 24500, and 24600). The other three courses are chosen in consultation with the student's advisor. The Master's paper is not a thesis and does not require original research. A common topic for a paper is a thorough analysis of some real data set. In addition to the coursework and paper, M.S. students are also expected to participate in the departmental consulting program. For a student with the appropriate background, the M.S. program can be readily completed in one calendar year. More information about the M.S. program is contained in a separate document.


Courses

Courses in the first list are offered each year and are intended mainly for undergraduates and for graduate students from disciplines other than statistics, except for Statistics 24400-24500-24600, which M.S. students may count towards their degrees.

20000 Elementary Statistics. Introduction to statistical concepts and methods for the collection, analysis, interpretation, and presentation of data. Elements of sampling; simple techniques for analysis of means, proportions, and linear association are used to illustrate both effective and fallacious uses of statistics.

22000 Statistical Methods and Their Applications. An introduction to statistical techniques and methods of data analysis, including use of computers. Examples will be drawn from the natural and social sciences. Topics include: data reporting, random variation and sampling, one- and two-sample problems, the analysis of variance, linear regression, and analysis of discrete data.

22200 Linear Models and Experimental Design. Principles and techniques for analyzing experimental data and planning the statistical aspects of experiments, surveys and observational programs. Linear and nonlinear models; analysis of variance and response surface analysis; randomization, blocking, and factorial designs; fractional replication and confounding; incorporating covariate information; split-plot and nested experiments; components of variance.

22400 Applied Regression Analysis. Methods and applications of fitting and interpreting multiple regression models with emphasis on the method of least squares; examination of residuals, transformation of data, selecting regression models, dummy variables, tests of fit, biases due to excluded variables and measurement error, use and interpretation of computer package regression programs.

22600 Analysis of Qualitative Data. The interplay between statistical and subject-matter issues in the choice and interpretation of log-linear and logistic models for quantal, ordinal and nominal responses is emphasized. Sampling models, response scales, and model construction. Applications in the biological and social sciences emphasize interpretation rather than computational procedure.

23400-23500 Statistical Models and Methods I, II. This course presents basic ideas of probability theory and statistics, and is recommended for students throughout the natural and social sciences who want a broad background in statistical methodology and exposure to the probability models and statistical concepts underlying the methodology. Probability is developed for the purpose of modeling outcomes of random phenomena. Some models are studied mathematically and others via simulation on a computer. Binomial, Poisson, normal and other standard probability distributions are considered. Statistical methods for describing data and making inferences based on samples from populations are presented. Methods are illustrated on examples and studied via simulation. Topics include Bayesian inference, maximum likelihood estimation, and repeated-sampling frequentist inference. Methods for one- and two-sample problems, analysis of variance, analysis of counted data, and correlation and regression are studied. Graphical and numerical data descriptions are used for exploration, communication of results, and comparing mathematical consequences of probability models and data. Mathematics is employed to the level of univariate calculus but is less demanding than that required by STAT 24400-24500. Other than the mathematical level, the content of the two sequences is similar.

24000 Statistics and Probability for the Natural Sciences. An introduction to probability and statistics and their application to problems in the physical sciences.

24400-24500-24600 Statistical Theory and Methods. Principles and techniques of statistics with emphasis on the analysis of experimental data. Discrete and continuous probability distributions, transformation of random variables; principles of inference including Bayesian inference, maximum likelihood estimation, hypothesis testing and confidence intervals, likelihood-ratio tests, multinomial distributions and chi-square tests; analysis of variance and regression analysis.

25100 Introduction to Mathematical Probability. Fundamentals and axioms; combinatorial probability; conditional probability and independence; binomial, Poisson, and normal distributions; weak law of large numbers and central limit theorem; random variables, generating functions.

26700 History of Statistics. This course will focus on the period 1650 to 1920, and on mathematical developments in the theory of probability and how they came to be used in the sciences, both to quantify uncertainty in observational data and as a conceptual framework for scientific theories. The course will include closer looks at specific people and investigations, including reanalyses of historical data.


 

Courses in the second list are offered each year or frequently in alternate years and are intended, primarily but not exclusively, for graduate students in statistics.

30100-30200 Mathematical Statistics. The mathematical structure of modern statistics including statistical models, parameter estimation, comparison of estimators, efficiency, confidence sets, theory of hypothesis tests, introduction to Bayesian analysis, elements of decision theory and asymptotic methods.

30400 Distribution Theory. Univariate and multivariate distributions and densities. Computer algebra. Rigorous treatment of limit switching. Moment and cumulant generating functions. Characteristic functions. Convergence in distribution. Central limit theorems. Edgeworth expansions.

30700 Numerical Computation. Introduction to statistical and numerical computation in C++.

31200-31300 Introduction to Stochastic Processes. An introduction to stochastic processes not requiring measure theory. Topics covered include branching processes, recurrent events, renewal theory, random walk, Markov chains, Poisson and birth-and-death processes, continuous processes such as Brownian motion, martingales.

32000 Bayesian Statistics. Basic concepts such as models and parameters, Bayes's theorem, prior and posterior distributions, and recent advances in computational techniques: asymptotic approximation, Laplace expansions, numerical integration, iterative or exact computations that exploit conditional independence structures. Bayesian data analysis.

32900 Applied Multivariate Analysis. Basic techniques for the analysis of multidimensional data in application to problems in science and business. Graphical representation, descriptive and diagnostic statistics; the multivariate normal distribution; multivariate linear regression; principal components and factor analysis; discrimination, classification and clustering.

33100 Sample Surveys. Development of the classical techniques of sample surveys, random sampling methods, stratification, cluster sampling, ratio estimation, and elaborations of these ideas with due attention to derivations and limitations. Methods for dealing with nonresponse and partial response are addressed.

34300 Applied Linear Statistical Methods. The theory, methods and applications of fitting and interpreting multiple regression models with emphasis on the method of least squares. Residuals, transformations, selection of regression equations, tests of fit, nonlinear models, biases due to excluded variables and measurement error; relation to linear algebra, effects of violations of assumptions.

34500 Design and Analysis of Experiments. The methodology and application of linear models in experimental design with a focus on the principles of experimental design, such as blocking, randomization, fractionation and confounding. Analysis of designed experiments with emphasis on the role of fixed and random effects.

34700 Generalized Linear Models. A thorough introduction to the theory, methods and applications of generalized linear models. Factors, variates, interactions; exponential-family models and variance functions. Specific examples include logistic and probit regression, cumulative logistic models, log-linear models and contingency tables. Quasi-likelihood, least squares and partially linear models.

36700 History of Statistics. This course will focus on the period 1650 to 1920, and on mathematical developments in the theory of probability and how they came to be used in the sciences, both to quantify uncertainty in observational data and as a conceptual framework for scientific theories. The course will include closer looks at specific people and investigations, including reanalyses of historical data.

38100-38200-38300 Measure-Theoretic Probability. A detailed, rigorous treatment of probability from the point of view of measure theory, including a development of the latter subject: existence theorems, integration and expectation, characteristic functions, moment problems, limit laws, Radon-Nikodym derivatives, conditional probabilities, martingales, Brownian motion.

39000-39100 Stochastic Calculus and Finance I, II. This course is an introduction to stochastic calculus as it is relevant to the pricing and hedging of options and other derivative securities. It is offered in collaboration with the master's program in Mathematical Finance. At the end of the first course you should be able to use Itô's lemma, Girsanov's theorem, martingale representation and martingale limit theory to evaluate concrete derivatives. The second course concerns incompleteness of markets and statistical issues.

44100 Consulting in Statistics.


 

The third list includes courses that have been taught in recent years. Each year a selection of advanced or specialized courses similar to these is offered, the choice depending on the current interests of the faculty and students.

30500 Theoretical Statistics. The foundations of statistical inference; the likelihood principle, the sufficiency and conditionality principles, Bayesian and frequency-based inferences. The definition, construction and interpretation of significance tests and confidence intervals, both approximate and exact, form a major component of the course.

31400 Tensor Methods in Statistics. An advanced course emphasizing aspects of multivariate calculation. Multivariate moments and cumulants, set partitions and Möbius inversion, k-statistics and symmetric functions, stochastic expansion, Edgeworth and saddlepoint approximation, and applications to likelihood-ratio and related statistics. The emphasis is on application of methods rather than on conditions for asymptotic validity.

32200 Bayesian Data Analysis. This is a comprehensive treatment of the statistical analysis of data from a Bayesian perspective. Modern computational tools are emphasized, and inferences are typically obtained using computer simulations. Emphasis is placed on gaining first-hand experience of model-based data analysis, and the course combines lectures and presentations in an interactive format to maximize learning through practice. Participants are expected to have a good command of basic statistical knowledge as well as computing skills.

33400 Applied Forecasting.

33500 Time Series Analysis. Linear time series models with application to real world problems. Stationary and nonstationary processes; linear dynamic models; autoregressive and moving average models; iterative model building and diagnostic checking; prediction theory, signal extraction and Kalman filter; asymptotic theory of parameter estimates; transfer function models and intervention analysis.

33900 Spatial Statistics. Interpolation of spatial processes based on limited observations is a fundamental problem in spatial statistics and is the focus of this course. Topics in probability and mathematical and computational statistics are developed as needed to provide the tools for a detailed understanding of statistical approaches to spatial interpolation.

35000 Fundamentals of Epidemiology. See HlthSt 30900.

35100 Advanced Epidemiology. See HlthSt 31100.

35500 Statistical Genetics. The main topic is the mapping of human disease genes and genetic markers by the analysis of pedigree data. The course addresses the major statistical and computational problems in the analysis of large pedigrees with complex genetic models.

35600 Introduction to Survival Analysis. See HlthSt 33100.

37900 Computer Vision.

39800 Field Research (summer quarter only).

39900 Master's Seminar.

45500 Statistical Genetics (50-unit course).

45600 Workshop in Genetics.

Last update: 8/29/07