Thursday, November 13, 2014

Reflections on programming (cross-post)

For some unknown obsessive-compulsive reason, I've taken to sorting my infrequent blog posts by topic. The post linked below isn't really about survey methodology per se, so I posted it elsewhere. Thought some of you might be interested, though.

http://researchefficiency.blogspot.com/2014/11/the-long-road-that-is-short.html

Tuesday, July 8, 2014

A beginner's guide to response rates

One of the most common questions I get in survey practice is "What is a good response rate?" or "Is my survey's response rate good enough? Do I have nonresponse bias?" Survey methodologists reading this are probably taking a deep breath and figuring out where to start their response. Here are a few things I think everyone should know about response rates (non-technical...I will post later on AAPOR response rate calculation).

1) The answer depends on what you mean by "good". 

"Good" can mean "high enough to publish in a specific journal," or, "high overall (e.g. 80-90% or more)," or what people usually want to know, "Are my results biased?"

"Good" might also mean "Do (will) I have enough cases for key analyses?". 

In my mind, "good" should mean "good relative to other surveys in the same mode with similar design features." We just can't expect 50% RRs from RDD surveys and shouldn't get upset when we don't see them.

2) Any statement about survey "goodness" or data quality has to be conditional on the amount of resources spent/spendable. 

'nuff said.

3) Response rates are good for some things, but not others.

Good for:

a) Tracking an ongoing survey's performance over time
b) Comparing surveys that are conducted under the same or very similar "essential survey conditions" (e.g., mode, contact materials and protocols, costs/resources)
c) Planning survey costs, inference (CI's and power analyses), and number of completed cases
d) Making initial assessments of approximate representation of key subgroups

Not good for:

a) Assessing nonresponse bias. See work by Groves, Groves & Peytcheva, and others. This is lesson number 1 or 2 in survey methodology training, but often isn't intuitive outside our field until explained. Statisticians usually understand this inherently, but substantive researchers may not. It's easily taught, though.
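To see why, here's the standard deterministic decomposition of nonresponse bias for a sample mean (this is the logic behind the Groves & Peytcheva findings; the notation here is mine: n sampled cases, r respondents, m = n - r nonrespondents):

```latex
% Bias of the respondent mean relative to the full-sample mean:
% the nonresponse rate times the respondent-nonrespondent difference.
\bar{y}_r - \bar{y}_n = \frac{m}{n}\left(\bar{y}_r - \bar{y}_m\right)
```

The bias is a product of two terms: the nonresponse rate and the respondent-nonrespondent difference on the statistic itself. So a survey with a high response rate but a big respondent-nonrespondent difference on a given variable can be more biased than a survey with a low response rate and a small difference. The rate alone can't tell you which situation you're in.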

On a related note, I was just reviewing notes from Jill Montaquila and Kristen Olson's short course "Practical Tools for Nonresponse Bias Studies." If you want to learn more about how to assess NR bias, I recommend taking the course.

Friday, July 4, 2014

My other blog...

I've debated how much to post here that isn't specifically about survey methodology, so I started a secondary blog at
http://researchefficiency.blogspot.com
It seems like a lot of the things I want to blog about lately are research practice, coding, project management, efficiency, etc. The new blog will be the outlet for those kinds of topics (with some cross-posting of course). See my recent post on developing a personal code library and an earlier one on Excel shortcuts.
http://researchefficiency.blogspot.com/2014/07/developing-your-personal-statistical.html

Tuesday, June 3, 2014

Training in survey methodology and practice


There's an upcoming DC AAPOR event on survey methodology training on June 13 (http://dc-aapor.org/upcomingevents.php). I can't attend, so I thought I'd share some of my own thoughts on the matter here. I think about this topic from three different perspectives.

As a survey methodology instructor and trainer of future methodologists: 
  1. Instructors should distinguish clearly whether their training (e.g., their course or degree curriculum) is about "survey research practice" or "survey methodology," or what fraction of each. Those seeking practice training can be turned off by methodological debates and esoterica, and the line between esoterica and fundamentals isn't always clear, particularly in an interdisciplinary field like survey methodology. Survey methodology is an applied yet scientific field and should have a balance of both perspectives.
  2. While a methodology focus trains the next generation of scientists and leaders, it may not give students broad enough training in concrete techniques, because the focus is on isolating and filling gaps in small areas of the field. That doesn't mean that graduate programs can't have both. For example, an MS program could have an applied track for those who want to go to work after training, and a "theory" (for lack of a better word) track for those who want to go on to a PhD.
  3. Include official, sanctioned specializations (see student point 2 below) outside of survey methodology programs.
  4. Use Bloom's taxonomy of learning when planning courses. I've used it in my own courses, and it helps operationalize clear course outcomes and goals and structure the course to meet them. Otherwise we just end up teaching what we happened to learn in the way we happened to learn it, and may not be optimizing instruction and student experiences for the outcomes we want them to have.
As a student:
  1. Stats v. Social Science focus: I'm sure opinions are split on this (my own opinion is split depending on the context). Groves (as you might expect) wanted us each to be strong in all of it (and "Do it better than we did." A tall order). On the other hand, the broader you go in topics, the less focused you can be. I'm glad I pushed my statistical boundaries and learned things I never thought I would learn. Although I still consider myself more of a social scientist, I can practice at a level of statistics I never thought I would. The counterargument is that it's been hard to focus on one or two problems and get research done. If you're going to go broad, make sure you get things out and published regularly so you don't end up with a scattered CV.
  2. Talks are fun, but only pubs really matter (can't emphasize that enough now that I'm out). Take extra time to publish before life gets in the way (e.g., an extra 6 months or a year before defending, or a postdoc instead of a "regular" job). I guess this means faculty should be giving you room to publish (either co-authoring or solo papers based on class projects). MPSM/JPSM have good models for this in the Practicum, TSE, and Design Seminar courses.
  3. Read and study outside survey methodology: Not just to find your field of application, but to find areas that will advance survey methodology. For example, I took social psych courses and read the communications and linguistics literature in my graduate work. I still try to keep half an eye on decision science and other psychological and behavioral economics research that has something to say about measurement and nonresponse "decisions". I'm sure there are parallels in statistical work (e.g., estimation techniques or applied problems that aren't in the main view of usual survey statistics).

As an employer:
  1. I expect (or hope) that students coming out of formal survey methodology training (v. social, psych, or education research methods in another field/discipline, or from stats programs) will have a balance of conceptual perspective and concrete skills. For example, I expect JPSM, MPSM, and SRAM students (or those who take my course) to have a handle on the TSE framework and terminology, at least at a level that facilitates discussion, so we can quickly/easily zero in on whether we're talking about coverage error, sampling error, or what. I don't know if every SM program is teaching this (or a similar) model, but we need something that moves us from niche jargon to relatively standard technical terminology. I'm probably biased, but I feel like TSE does that (as do some of the other frameworks out there). Terminology and models are a core part of the science of survey methodology in my mind, but I also expect grads to be able to DO things.
  2. I expect soc- and stat-side students to have decent quant skills (both interpretation and production), more so if on the stats side. It doesn't seem right to me (given the current social science paradigm) to turn out students who can't do basic analysis, basic experiment design, or understand the basics of survey weights and variance estimation. Students should seek this kind of training if their program doesn't provide it. I would expect even MS students to have a working knowledge of these things and be able to refresh as needed.
  3. If I were hiring an MS-level soc-side person, I would expect these classes:
    1. Data collection methods
    2. Questionnaire design
    3. Applied Sampling
    4. Cognition, or a course on social aspects of measurement
    5. Practicum courses
      1. Covering nonresponse avoidance and sampling techniques...really "how to"
    6. Intro stats (2 semesters, through at least linear and logistic regression)
    7. Analysis of complex sample survey data
  4. If I were hiring an MS-level stat person, I would expect these classes:
    1. Data collection methods
    2. Applied Sampling
    3. Sampling theory (or something more mathematical than applied sampling)
    4. Missing data/imputation
    5. Practicum courses
      1. Covering analysis
    6. Intro stats (3 semesters, through at least linear and logistic regression)
    7. Analysis of complex sample survey data
    8. Advanced variance estimation
    9. Introduction to latent variable models
      1. Pref with some exposure to complex survey data
    10. Introduction to longitudinal analysis
      1. Pref with some exposure to complex survey data

Friday, May 23, 2014

Methods of efficiency

I'm convinced that micro-level behaviors, habits, and actions are just as important to becoming a productive researcher as having big ideas. I've been working on improving those things over the past year. You could call these things the "methods of doing research work" but many apply to other kinds of creative work, technical work, and project management. We don't talk about them a lot in professional circles because they're not the big/sexy ideas that change the world in one fell swoop. However, they are the mitochondria of our research cells, and I think we should share tips and tricks like this much more often for the larger benefit of the field.

My wife and I had breakfast with our friends Mario and Ana this past week, and we barely got to share personal stories because we were sharing efficiency tips and tricks the whole time. Here are two pieces of software I've come to love (M & A: one is the program whose name I couldn't recall, and the other I just found this week). Both reduce the keying/mousing you have to do, which seems small but adds up. AutoHotkey lets you program scripts and macros for any key combination or mouse movement, so it's VERY versatile and great for jobs that require repeated key/mouse movements. Breevy (which I just started using today) lets you record keyboard shortcuts and text-expansion phrases like you can in Word with AutoComplete/AutoCorrect, but it works across all programs in Windows. Sure beats programming specific keyboard shortcuts within individual programs.
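For the script-inclined, here's a rough Python analogue of the text-expansion and hotkey ideas, using the third-party keyboard package (pip install keyboard). The abbreviation and hotkey below are made-up examples, not my actual setup, and this is just a sketch of the concept, not a replacement for either program:

```python
# Minimal text-expansion and hotkey sketch using the `keyboard` package.
# Needs OS-level keyboard access; press Esc to quit.
import keyboard

# Typing the trigger (then a word break) replaces it with the full text,
# in any application.
keyboard.add_abbreviation('@sig', 'Best,\nMatt')

# Bind a hotkey to an arbitrary action (here, typing a date stamp).
keyboard.add_hotkey('ctrl+shift+d', lambda: keyboard.write('2014-05-23'))

keyboard.wait('esc')  # keep listening until Esc is pressed
```

AutoHotkey and Breevy do all of this (and much more) without writing any code, which is their whole appeal.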

Mention other favorites if you have them.

Job Opening at NASS

I'm not sure I'm brazen enough to believe that my blog reaches more people than the SRMS and AAPOR listservs, but I thought I'd post this NASS job opening to help out a colleague. NASS has always seemed like a fun and innovative place to me. And the federal home of Likert of course :)

*********************************************************
Hello all,

We are looking to fill a senior level mathematical statistician position.  It will be a great opportunity for the right person.

The description is below, and the job will be open for applications until June 5.  Please pass along to any other interested candidates.  Thanks!


The U.S.D.A.’s National Agricultural Statistics Service (NASS) is searching for a senior mathematical statistician (ST-1529-00) who will serve as the Research and Development Division’s Deputy Director for Science and Planning. The National Agricultural Statistics Service (NASS) is the data collection and dissemination arm of the U.S. Department of Agriculture. NASS gathers and publishes a vast array of information on every facet of U.S. agriculture, including production, economics, demographics, and the environment. The incumbent serves NASS as a research statistician in mathematical statistics for agricultural surveys and censuses, geospatial techniques, statistical modeling for estimation and process measurement. Primary qualifications include a senior science degree of technical skill in mathematical statistics and probability sampling, especially in the area of geospatial analyses, model-based estimators, and non-sampling errors. Research activities include advanced survey sampling design and estimation methods and theory; geospatial estimation methods and theory; measurement error models; nonsampling error models; list, area and multiple frame sampling methods and theory; forecasting techniques; statistical modeling for estimation; and multivariate and quality control methods. The incumbent will lead teams of mathematical statisticians and serves in research management and science leadership by advising the Administrator of NASS, Director of Research and Development Division and Director of the Statistical Methodology on statistical issues and methodology affecting NASS programs. The incumbent’s research efforts will be focused about 90 percent internally, on improving the Agency’s census and survey estimation programs and research, and about 10 percent split between liaison activities with the statistical community as a whole and reimbursable consulting with external entities. More information on the position and how to apply may be found using the following link: https://www.usajobs.gov/GetJob/ViewDetails/370711100




Jaki S. McCarthy
Senior Cognitive Research Methodologist
USDA's National Agricultural Statistics Service
Research and Development Division

Tuesday, May 20, 2014

New blog name

I decided to change my blog name today (same URL...no worries). No one said they were offended by the "meth addict" joke, but I figured I should change it if I want this to be my "official" professional blog. For the record, here is the text that I had as a footer to the title when it was named "Diary of a Meth Addict"...just to show that although I have a dark sense of humor sometimes, I'm a sensitive guy.

"This blog discusses survey methodology, statistical methodology and social science research. It does not discuss addictions to methamphetamine or other drugs. If you or a loved one is dealing with that kind of meth addiction, you may find help at the site linked here "

In one last attempt at potentially-inappropriate humor, here are some helpful resources in case you or a loved one is dealing with my kind of meth addiction.

http://surveyresearch.uconn.edu/
http://psm.isr.umich.edu/
http://si.isr.umich.edu/
http://www.jpsm.umd.edu/
http://sram.unl.edu/
http://www.uic.edu/cuppa/pa/srm/



Friday, May 9, 2014

Summer online course in Methods of Data Collection

My online Methods of Data Collection course at UConn is short on students, and the program will decide next week (week of 5-11) whether to run it or drop it. It's scheduled to run June 2 - Aug 8. If you're interested, please contact diane.clokey@uconn.edu to register. Please share with colleagues, students and friends. Thanks.

Course Description

PP 5397 H02 – class #2133 -- Methods of Data Collection, Dr. Matthew Jans
This course explores the many challenges of survey data collection and highlights points in the data collection process where survey error can be introduced (intentionally or unintentionally). Using the Total Survey Error framework, supported by contemporary and classical research findings, you will learn how to evaluate potential survey errors related to various methods of data collection. The course goal is to help you become a savvy consumer and designer of survey research. Along the way you will learn about new techniques and facets of survey methodology, but this is not a "how to" course. Rather than learning one or two ways to design a questionnaire or a sample, you will learn and practice methods for assessing the quality of survey data and survey designs. This kind of training is rare, but it will prepare you to work in a variety of survey research jobs, whether you are more of a statistician or more of a social scientist, and whether your interests are on the academic side or the applied side of our field. We will cover the full range of data collection components and error sources, including coverage, sampling, nonresponse, and measurement. Mode effects and interviewer effects will be studied, and there will be lessons that focus specifically on paradata and web surveys.

Wednesday, April 30, 2014

A place for response rates: Why we still need them


In the survey methodology and statistics worlds it can be easy to become dismissive of response rates as a method of evaluating survey goodness. We know that they don't indicate bias per se (and are often poor indicators of it). We know they really apply to individual questions or estimates, but are often only reported at the survey level. We know there are more accurate and statistically-satisfying ways to measure nonresponse error, not to mention general survey and estimate quality. 

Yet I think there is still an important role for response rates. I've been reading a lot of technical survey documentation lately, which has me reflecting on our practices.

1) Formalizing response rates (as AAPOR and CASRO have done) gives the field a starting guidepost for evaluating surveys. Even if RRs aren't the final word on quality, standardization makes comparison across surveys easier. Having a shorthand (e.g., "we used AAPOR RR4") also makes scientific communication quicker, so we avoid having to explain exactly how the response rate was calculated every time we present one. This has to be better than the days before the AAPOR Standard Definitions.

2) The definition of the rate, that is, deciding what to include in or exclude from a response rate (or other similar rate), helps clarify what type of "participation" is being measured. Is it participation among cases that are eligible and contacted (e.g., a cooperation rate)? A cooperation rate can be used as a measure of the efficacy of contact materials or interviewers in gaining cooperation, by setting aside the task of making contact. A response rate would be a less clean measure for that purpose. Are respondents only included if they complete the entire survey, or are partially-complete interviews included in the numerator as equivalent to complete interviews (AAPOR RR2, RR4 and RR6, and COOP2 and COOP4)? If we want a measure of "sampled units from which we have any data (or key variables)," including partials makes sense. If we want a measure of "response among committed question answerers," or "fully complete data," excluding them is better. (A quick numeric sketch of how these formulas play out appears after point 4 below.)

3) Having a formalized set of disposition and response rate terminology makes it easy to explain our methods and practices to people in other fields (including clients who might not be familiar with survey sampling and data collection). A close read of the Standard Definitions is a great tutorial on the logistics of survey response and response rate calculation. 

4) More than just a tool for scientific communication, standardized disposition definitions and response rate formulas embody our field's ethics of clarity in and disclosure of methods. They are akin to reporting question and response option wording whenever reporting survey research. 
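To make point 2 concrete, here's a minimal Python sketch of how a few of the AAPOR response and cooperation rates fall out of final disposition counts. The counts are hypothetical, and the formulas follow my reading of the Standard Definitions (RR1-RR4, COOP1-COOP2):

```python
# Hypothetical final disposition counts for one survey.
I   = 600   # complete interviews
P   = 50    # partial interviews
R   = 200   # refusals and break-offs
NC  = 100   # non-contacts
O   = 25    # other eligible non-interviews
UNK = 150   # unknown eligibility (unknown household + unknown other)
e   = 0.80  # estimated share of unknown-eligibility cases that are eligible

# Response rates: RR1/RR3 count completes only; RR2/RR4 treat partials
# as respondents; RR3/RR4 discount the unknowns by e.
rr1 = I / (I + P + R + NC + O + UNK)
rr2 = (I + P) / (I + P + R + NC + O + UNK)
rr3 = I / (I + P + R + NC + O + e * UNK)
rr4 = (I + P) / (I + P + R + NC + O + e * UNK)

# Cooperation rates: cooperation among contacted, eligible cases
# (non-contacts and unknowns drop out of the denominator).
coop1 = I / (I + P + R + O)
coop2 = (I + P) / (I + P + R + O)

print(f"RR1={rr1:.3f}  RR2={rr2:.3f}  RR3={rr3:.3f}  RR4={rr4:.3f}")
print(f"COOP1={coop1:.3f}  COOP2={coop2:.3f}")
```

Same dispositions, different rates: counting partials or discounting unknowns moves the number, which is exactly why naming the formula (e.g., "AAPOR RR4") matters.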

While we continue to look for new indicators of data quality and estimate representativeness, I think response rates are here to stay.

Wednesday, April 16, 2014

Survey Methodology Terminology 101: "Method and Methodology"

Over the past few years I've worked in various sectors and with people from various academic and professional backgrounds. As a survey methodologist I'm tuned in to how people talk about methodology. I've noticed a few interesting uses of the term. Let me know what you think (or if you've seen the same uses).

1) Methodology v. Methods: Adding "ology" usually means "the science of" (e.g., psychology is the science of the psyche), so shouldn't "methodology" mean only "the study of methods" (as it does in survey methodology)? If that's the case, shouldn't technical documentation and methods sections of journal articles use the term "method," as in "the methods used for this study..." instead of "the methodology used for this study"? Of course, if you follow that rule, we misuse "psychology" all the time.

2) I've seen some researchers use the term "methods" (or methodology) to mean "everything except the statistical analysis" (e.g., survey or data collection, experimental design, sample). Yet I've also noticed that some statisticians use the term "method" to refer to the analysis method (e.g., using linear regression v. logistic regression).

Where would you draw the line between "methods" and other researchy things, and how do you think we should use the terms "methods" and "methodology"?