Enago Academy

Unraveling Research Population and Sample: Understanding their role in statistical inference

Research population and sample serve as the cornerstones of any scientific inquiry. Understanding the relationship between the two is crucial for researchers because it underpins the validity, reliability, and generalizability of their findings. In this article, we examine the roles of the research population and the sample, how they differ, and why each matters. Ultimately, this empowers researchers to draw informed conclusions and drive meaningful advancements in their respective fields.

What Is Population?

The research population, also known as the target population, refers to the entire group or set of individuals, objects, or events that possess specific characteristics and are of interest to the researcher. It represents the larger population from which a sample is drawn. The research population is defined based on the research objectives and the specific parameters or attributes under investigation. For example, in a study on the effects of a new drug, the research population would encompass all individuals who could potentially benefit from or be affected by the medication.

When Is Data Collection From a Population Preferred?

In certain scenarios where a comprehensive understanding of the entire group is required, it becomes necessary to collect data from a population. Here are a few situations when one prefers to collect data from a population:

1. Small or Accessible Population

When the research population is small or easily accessible, it may be feasible to collect data from the entire population. This is often the case in studies conducted within specific organizations, small communities, or well-defined groups where the population size is manageable.

2. Census or Complete Enumeration

In some cases, such as government surveys or official statistics, a census or complete enumeration of the population is necessary. This approach aims to gather data from every individual or entity within the population. This is typically done to ensure accurate representation and eliminate sampling errors.

3. Unique or Critical Characteristics

If the research focuses on a specific characteristic or trait that is rare and critical to the study, collecting data from the entire population may be necessary. This could be the case in studies related to rare diseases, endangered species, or specific genetic markers.

4. Legal or Regulatory Requirements

Certain legal or regulatory frameworks may require data collection from the entire population. For instance, government agencies might need comprehensive data on income levels, demographic characteristics, or healthcare utilization for policy-making or resource allocation purposes.

5. Precision or Accuracy Requirements

In situations where a high level of precision or accuracy is necessary, researchers may opt for population-level data collection. By doing so, they mitigate the potential for sampling error and obtain more reliable estimates of population parameters.

What Is a Sample?

A sample is a subset of the research population that is carefully selected to represent its characteristics. Researchers study this smaller, manageable group to draw inferences that they can generalize to the larger population. The selection of the sample must be conducted in a manner that ensures it accurately reflects the diversity and pertinent attributes of the research population. By studying a sample, researchers can gather data more efficiently and cost-effectively compared to studying the entire population. The findings from the sample are then extrapolated to make conclusions about the larger research population.

What Is Sampling and Why Is It Important?

Sampling refers to the process of selecting a sample from a larger group or population of interest in order to gather data and make inferences. The goal of sampling is to obtain a sample that is representative of the population, meaning that the sample accurately reflects the key attributes, variations, and proportions present in the population. By studying the sample, researchers can draw conclusions or make predictions about the larger population with a certain level of confidence.

Collecting data from a sample, rather than the entire population, offers several advantages and is often necessary due to practical constraints. Here are some reasons to collect data from a sample:

1. Cost and Resource Efficiency

Collecting data from an entire population can be expensive and time-consuming. Sampling allows researchers to gather information from a smaller subset of the population, reducing costs and resource requirements. It is often more practical and feasible to collect data from a sample, especially when the population size is large or geographically dispersed.

2. Time Constraints

Conducting research with a sample allows for quicker data collection and analysis compared to studying the entire population. It saves time by focusing efforts on a smaller group, enabling researchers to obtain results more efficiently. This is particularly beneficial in time-sensitive research projects or situations that necessitate prompt decision-making.

3. Manageable Data Collection

Working with a sample makes data collection more manageable. Researchers can concentrate their efforts on a smaller group, allowing for more detailed and thorough data collection methods. Furthermore, it is more convenient and reliable to store and conduct statistical analyses on smaller datasets. This also facilitates in-depth insights and a more comprehensive understanding of the research topic.

4. Statistical Inference

Collecting data from a well-selected and representative sample enables valid statistical inference. By using appropriate statistical techniques, researchers can generalize the findings from the sample to the larger population. This allows for meaningful inferences, predictions, and estimation of population parameters, thus providing insights beyond the specific individuals or elements in the sample.

5. Ethical Considerations

In certain cases, collecting data from an entire population may pose ethical challenges, such as invasion of privacy or burdening participants. Sampling helps protect the privacy and well-being of individuals by reducing the burden of data collection. It allows researchers to obtain valuable information while ensuring ethical standards are maintained.

Key Steps Involved in the Sampling Process

Sampling is a valuable tool in research; however, it is important to carefully consider the sampling method, sample size, and potential biases to ensure that the findings accurately represent the larger population and are valid for making conclusions and generalizations. While the specific steps may vary depending on the research context, here is a general outline of the sampling process:

1. Define the Population

Clearly define the target population for your research study. The population should encompass the group of individuals, elements, or units that you want to draw conclusions about.

2. Define the Sampling Frame

Create a sampling frame, which is a list or representation of the individuals or elements in the target population. The sampling frame should be comprehensive and accurately reflect the population you want to study.

3. Determine the Sampling Method

Select an appropriate sampling method based on your research objectives, available resources, and the characteristics of the population. You can perform sampling by either utilizing probability-based or non-probability-based techniques. Common sampling methods include random sampling, stratified sampling, cluster sampling, and convenience sampling.

4. Determine Sample Size

Determine the desired sample size based on statistical considerations, such as the level of precision required, desired confidence level, and expected variability within the population. Larger sample sizes generally reduce sampling error but may be constrained by practical limitations.

5. Collect Data

Once the sample is selected using the appropriate technique, collect the necessary data according to the research design and data collection methods. Ensure that you use a standardized and consistent data collection process that is appropriate for your research objectives.

6. Analyze the Data

Perform the necessary statistical analyses on the collected data to derive meaningful insights. Use appropriate statistical techniques to make inferences, estimate population parameters, test hypotheses, or identify patterns and relationships within the data.
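
As an illustration of step 4, the sketch below estimates the sample size needed to measure a proportion at a chosen margin of error and confidence level. It assumes simple random sampling and uses the standard formula n0 = z^2 * p * (1 - p) / e^2, with an optional finite population correction. The function name, its parameters, and the population of 2,000 are hypothetical choices made for this example and do not come from the article.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5, population=None):
    """Sample size for estimating a proportion under simple random sampling.

    p=0.5 is the most conservative guess because it maximizes p * (1 - p).
    If a population size is given, the finite population correction is applied.
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


# A 5-point margin of error at 95% confidence, large population.
print(sample_size_for_proportion(0.05))
# The same target for a hypothetical population of 2,000 people.
print(sample_size_for_proportion(0.05, population=2000))
```

The required sample size grows quickly as the margin of error shrinks, which is one reason practical constraints often cap the precision a study can achieve.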

Population vs Sample — Differences and examples

While the population provides a comprehensive overview of the entire group under study, the sample, on the other hand, allows researchers to draw inferences and make generalizations about the population. Researchers should employ careful sampling techniques to ensure that the sample is representative and accurately reflects the characteristics and variability of the population.

Research Study: Investigating the prevalence of stress among high school students in a specific city and its impact on academic performance.

Population: All high school students in a particular city

Sampling Frame: The sampling frame would involve obtaining a comprehensive list of all high schools in the specific city. A random selection of schools would be made from this list to ensure representation from different areas and demographics of the city.

Sample: Randomly selected 500 high school students from different schools in the city

The sample represents a subset of the entire population of high school students in the city.

Research Study: Assessing the effectiveness of a new medication in managing symptoms and improving quality of life in patients with a specific medical condition.

Population: Patients diagnosed with a specific medical condition

Sampling Frame: The sampling frame for this study would involve accessing medical records or databases that include information on patients diagnosed with the specific medical condition. Researchers would select a convenience sample of patients who meet the inclusion criteria from the sampling frame.

Sample: Convenience sample of 100 patients from a local clinic who meet the inclusion criteria for the study

The sample consists of patients from the larger population of individuals diagnosed with the medical condition.

Research Study: Investigating community perceptions of safety and satisfaction with local amenities in the neighborhood.

Population: Residents of a specific neighborhood

Sampling Frame: The sampling frame for this study would involve obtaining a list of residential addresses within the specific neighborhood. Various sources such as census data, voter registration records, or community databases offer the means to obtain this information. From the sampling frame, researchers would randomly select a cluster sample of households to ensure representation from different areas within the neighborhood.

Sample: Cluster sample of 50 households randomly selected from different blocks within the neighborhood

The sample represents a subset of the entire population of residents living in the neighborhood.

To summarize, sampling allows for cost-effective data collection, easier statistical analysis, and increased practicality compared to studying the entire population. However, despite these advantages, sampling is subject to various challenges. These challenges include sampling bias, non-response bias, and the potential for sampling errors.

To minimize bias and enhance the validity of research findings, researchers should employ appropriate sampling techniques, clearly define the population, establish a comprehensive sampling frame, and monitor the sampling process for potential biases. Validating findings by comparing them to known population characteristics can also help evaluate the generalizability of the results. Properly understanding and implementing sampling techniques helps ensure that research findings are accurate, reliable, and representative of the larger population. By carefully considering the choice of population and sample, researchers can draw meaningful conclusions and, consequently, make valuable contributions to their respective fields of study.

Now, it’s your turn! Take a moment to think about a research question that interests you. Consider the population that would be relevant to your inquiry. Who would you include in your sample? How would you go about selecting them? Reflecting on these aspects will help you appreciate the intricacies involved in designing a research study. Let us know about it in the comment section below or reach out to us using #AskEnago and tag @EnagoAcademy on Twitter, Facebook, and Quora.

Introduction to Research Methods

7 Samples and Populations

So you’ve developed your research question, figured out how you’re going to measure whatever you want to study, and have your survey or interviews ready to go. Now all you need is other people to become your data.

You might say ‘easy!’, there are people all around you. You have a big family tree, and surely they and their friends would be happy to take your survey. And then there are your friends and the people you’re in class with. Finding people is way easier than writing the interview questions or developing the survey. That reaction might be a strawman; maybe you’ve come to the conclusion that none of this is easy. For your data to be valuable, you not only have to ask the right questions, you have to ask the right people. The “right people” aren’t the best or the smartest people; the right people are determined by what your study is trying to answer and the method you’re using to answer it.

Remember way back in chapter 2 when we looked at this chart and discussed the differences between qualitative and quantitative data.

One of the biggest differences between quantitative and qualitative data was whether we wanted to be able to explain something for a lot of people (what percentage of residents in Oklahoma support legalizing marijuana?) versus explaining the reasons for those opinions (why do some people support legalizing marijuana and others not?). The underlying difference there is whether our goal is to explain something about everyone, or whether we’re content to explain it about just our respondents.

‘Everyone’ is called the population. The population in research is whatever group the research is trying to answer questions about. The population could be everyone on planet Earth, everyone in the United States, everyone in rural counties of Iowa, everyone at your university, and on and on. It is simply everyone within the unit you are intending to study.

In order to study the population, we typically take a sample or a subset. A sample is simply a smaller number of people from the population that are studied, which we can then use to understand the characteristics of the population. That’s why a poll of 1300 likely voters can be used to guess at who will win your state’s governor’s race. It isn’t perfect, and we’ll talk about the math behind all of it in a later chapter, but for now we’ll just focus on the different types of samples you might use to study a population with a survey.

If correctly sampled, we can use the sample to generalize information we get to the population. Generalizability, which we defined earlier, means we can assume the responses of people to our study match the responses everyone would have given us. We can only do that if the sample is representative of the population, meaning that they are alike on important characteristics such as race, gender, age, education. If something makes a large difference in people’s views on a topic in your research and your sample is not balanced, you’ll get inaccurate results.

Generalizability is more of a concern with surveys than with interviews. The goal of a survey is to explain something about people beyond the sample you get responses from. You’ll never see a news headline saying that “53% of 1250 Americans that responded to a poll approve of the President”. It’s only worth asking those 1250 people if we can assume the rest of the United States feels the same way overall. With interviews, though, we’re looking for depth from the responses, and so we are less hopeful that the 15 people we talk to will exactly match the American population. That doesn’t mean the data we collect from interviews doesn’t have value, it just has different uses.

There are two broad types of samples, with several different techniques clustered below those. Probability sampling is associated with surveys, and non-probability sampling is often used when conducting interviews. We’ll first describe probability samples, before discussing the non-probability options.

The type of sampling you’ll use will be based on the type of research you’re intending to do. There’s no sample that’s right or wrong, they can just be more or less appropriate for the question you’re trying to answer. And if you use a less appropriate sampling strategy, the answer you get through your research is less likely to be accurate.

7.1 Types of Probability Samples

So we just hinted at the idea that depending on the sample you use, you can generalize the data you collect from the sample to the population. That will depend though on whether your sample represents the population. To ensure that your sample is representative of the population, you will want to use a probability sample. A representative sample refers to whether the characteristics (race, age, income, education, etc.) of the sample are the same as the population. Probability sampling is a sampling technique in which every individual in the population has a known, nonzero chance of being selected as a subject for the research; in the simplest designs, that chance is equal for everyone.

There are several different types of probability samples you can use, depending on the resources you have available.

Let’s start with a simple random sample. In order to use a simple random sample all you have to do is take everyone in your population, throw them in a hat (not literally, you can just throw their names in a hat), and choose the number of names you want to use for your sample. By drawing blindly, you can eliminate human bias in constructing the sample and your sample should represent the population from which it is being taken.

However, a simple random sample isn’t quite that easy to build. The biggest issue is that you have to know who everyone is in order to randomly select them. What that requires is a sampling frame, a list of everyone in the population. But we don’t always have that. There is no list of residents of New York City (or any other city). Organizations that do have such a list won’t just give it away. Try to ask your university for a list and contact information of everyone at your school so you can do a survey? They won’t give it to you, for privacy reasons. It’s actually harder to think of populations you could easily develop a sampling frame for than those you can’t. If you can get or build a sampling frame, the work of a simple random sample is fairly simple, but that’s the biggest challenge.

Most of the time a true sampling frame is impossible to acquire, so researchers have to settle for something approximating a complete list. Earlier generations of researchers could use random-digit dialing to contact a random sample of Americans, because every household had a single phone. To use it you just pick up the phone and dial random numbers. Assuming the numbers are actually random, anyone might be called. That method actually worked somewhat well, until people stopped having home phone numbers and eventually stopped answering the phone. It’s a fun mental exercise to think about how you would go about creating a sampling frame for different groups though; think through where you would look to find a list of everyone in these groups:

  • Plumbers
  • Recent first-time fathers
  • Members of gyms

The best way to get an actual sampling frame is likely to purchase one from a private company that buys data on people from all the different websites we use.

Let’s say you do have a sampling frame though. For instance, you might be hired to do a survey of members of the Republican Party in the state of Utah to understand their political priorities this year, and the organization could give you a list of their members because they’ve hired you to do the research. One method of constructing a simple random sample would be to assign each name on the list a number, and then produce a list of random numbers. Once you’ve matched the random numbers to the list, you’ve got your sample. The example pairs a numbered list of 20 names with a list of 5 random numbers.
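
If you'd rather let a computer do the drawing, the numbered-list approach could be sketched in Python roughly like this. The 20 names here are made-up placeholders (not the names from the example), and the fixed seed is only there so the sketch gives the same answer each time you run it.

```python
import random

# Hypothetical sampling frame: 20 placeholder names, numbered 1 to 20.
sampling_frame = {number: f"Member {number}" for number in range(1, 21)}

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Draw 5 distinct random numbers from 1..20 (sampling without replacement).
random_numbers = sorted(rng.sample(sorted(sampling_frame), k=5))

# Match the random numbers back to the names on the list.
sample = [sampling_frame[number] for number in random_numbers]

print(random_numbers)
print(sample)
```

Because the draw is without replacement, nobody can end up in the sample twice, which is the same thing that happens when you pull names out of a hat.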

Systematic sampling is similar to simple random sampling in that it begins with a list of the population, but instead of choosing random numbers one would select every kth name on the list. What the heck is a kth? K just refers to how far apart the names you’re selecting are on the list. So if you want to sample one-tenth of the population, you’d select every tenth name. In order to know the k for your study you need to know your sample size (say 1000) and the size of the population (75000). You can divide the size of the population by the sample size (75000/1000), which will produce your k (75). As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method, but its only advantage over the random sampling technique is simplicity. If we used the same list as above and wanted to survey 1/5th of the population, we’d include 4 of the names on the list. It’s important with systematic samples to randomize the starting point in the list, otherwise people with A names will be oversampled. If we started with the 3rd name, we’d select Annabelle Frye, Cristobal Padilla, Jennie Vang, and Virginia Guzman, as shown below. So in order to use a systematic sample, we need three things: the population size (denoted as N), the sample size we want (n), and k, which we calculate by dividing the population size by the sample size.

N = 20 (population size)
n = 4 (sample size)
k = 20/4 = 5 (selection interval: every kth element)
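
The N, n, and k recipe can be sketched in a few lines of Python. This is just one way to do it: the function name and the placeholder list of names are made up for illustration, and the starting point is picked at random from the first k names, as recommended above.

```python
import random


def systematic_sample(frame, n, rng=random):
    """Select every kth element of the list after a random starting point."""
    N = len(frame)
    k = N // n                    # selection interval
    start = rng.randrange(k)      # random start among the first k positions (0-based)
    return [frame[start + i * k] for i in range(n)]


frame = [f"Name {i}" for i in range(1, 21)]   # hypothetical list, N = 20
picked = systematic_sample(frame, n=4, rng=random.Random(7))
print(picked)
```

With N = 20 and n = 4, the interval k is 5, so whichever of the first five names is chosen as the start, the other three picks follow at positions 5, 10, and 15 places further down the list.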

We can also use a stratified sample, but that requires knowing more about the population than just their names. A stratified sample divides the study population into relevant subgroups, and then draws a sample from each subgroup. Stratified sampling can be used if you’re very concerned about ensuring balance in the sample or there may be a problem of underrepresentation among certain groups when responses are received. Not everyone in your sample is equally likely to answer a survey. Say for instance we’re trying to predict who will win an election in a county with three cities. In City A there are 1 million college students, in City B there are 2 million families, and in City C there are 3 million retirees. You know that retirees are more likely than busy college students or parents to respond to a poll. So you break the sample into three parts, ensuring that you get 100 responses from City A, 200 from City B, and 300 from City C, so the three cities appear in the sample in the same proportions as in the population. A stratified sample provides the researcher control over the subgroups that are included in the sample, whereas simple random sampling does not guarantee that any one type of person will be included in the final sample. A disadvantage is that it is more complex to organize and analyze the results compared to simple random sampling.
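
To see where the 100, 200, and 300 come from, here is a small Python sketch of proportional allocation using the three cities above. The total of 600 responses is simply the sum of the three quotas in the example; the variable names are made up for illustration.

```python
# Population sizes from the three-city example.
population = {"City A": 1_000_000, "City B": 2_000_000, "City C": 3_000_000}
total_responses = 600  # 100 + 200 + 300 in the example

total_population = sum(population.values())

# Each city's quota is its share of the population times the total sample size.
quotas = {
    city: round(total_responses * size / total_population)
    for city, size in population.items()
}
print(quotas)
```

With less tidy numbers the rounded quotas may not add up exactly to the total, so you would nudge one or two of them by hand. Once the quotas are set, you would draw a random sample of that size within each city.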

Cluster sampling is an approach that begins by sampling groups (or clusters) of population elements and then selects elements from within those groups. A researcher would use cluster sampling if getting access to elements in an entire population is too challenging. For instance, a study on students in schools would probably benefit from randomly selecting from all students at the 36 elementary schools in a fictional city. But getting contact information for all students would be very difficult. So the researcher might work with principals at several schools and survey those students. The researcher would need to ensure that the students surveyed at the schools are similar to students throughout the entire city, and greater access and participation within each cluster may make that possible.

The image below shows how this can work, although the example is oversimplified. Say we have 12 students that are in 6 classrooms. The school is in total 1/4th green (3/12), 1/4th yellow (3/12), and half blue (6/12). By selecting the right clusters from within the school our sample can be representative of the entire school, assuming these colors are the only significant difference between the students. In the real world, you’d want to match the clusters and population based on race, gender, age, income, etc. What if 5/12ths of the school was yellow and 1/12th was green, how would I get the right proportions? I couldn’t, but you’d do the best you could. You still wouldn’t want 4 yellows in the sample, you’d just try to approximate the population characteristics as best you can.
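
For the school example, a two-stage cluster sample could be sketched in Python like this. Everything except the 36 schools is an assumption made for illustration: the rosters of 50 students per school, the choice of 6 schools, and the 10 students surveyed in each one. Note that this sketch picks clusters at random, rather than hand-picking clusters to match the population the way the colored example does.

```python
import random

rng = random.Random(3)  # fixed seed only so the sketch is repeatable

# Hypothetical city: 36 schools, each with a roster of 50 placeholder students.
schools = {
    f"School {s}": [f"School {s} / Student {i}" for i in range(1, 51)]
    for s in range(1, 37)
}

# Stage 1: randomly choose the clusters (schools) to work with.
chosen_schools = rng.sample(sorted(schools), k=6)

# Stage 2: randomly choose students within each chosen school.
sample = [
    student
    for school in chosen_schools
    for student in rng.sample(schools[school], k=10)
]

print(chosen_schools)
print(len(sample))
```

The payoff is that you only need rosters for 6 schools instead of all 36. The cost is that if the chosen schools happen to be unusual, the sample will be too, so you'd still want to compare it to what you know about the city as a whole.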

7.2 Actually Doing a Survey

All of that probably sounds pretty complicated. Identifying your population shouldn’t be too difficult, but how would you ever get a sampling frame? And then actually identifying who to include… It’s probably a bit overwhelming and makes doing a good survey sound impossible.

Researchers using surveys aren’t superhuman though. Oftentimes, they use a little help. Because surveys are really valuable, and because researchers rely on them pretty often, there has been substantial growth in companies that can help to get one’s survey to its intended audience.

One popular resource is Amazon’s Mechanical Turk (more commonly known as MTurk). MTurk is at its most basic a website where workers look for jobs (called HITs, short for Human Intelligence Tasks) listed by employers, and choose whether to do the task or not for a set reward. MTurk has grown over the last decade to be a common source of survey participants in the social sciences, in part because hiring workers costs very little (you can get some surveys completed for pennies). That means you can get your survey completed with a small grant ($1-2k at the low end) and get the data back in a few hours. Really, it’s a quick and easy way to run a survey.

However, the workers aren’t perfectly representative of the average American. For instance, researchers have found that MTurk respondents are younger, better educated, and earn less than the average American.

One way to get around that issue, which can be used with MTurk or any survey, is to weight the responses. Because with MTurk you’ll get fewer responses from older, less educated, and richer Americans, you want the responses you do get from those groups to count for more, to make your sample more representative of the population. Oversimplified example incoming!

Imagine you’re setting up a pizza party for your class. There are 9 people in your class, 4 men and 5 women. You only got 4 responses from the men, and 3 from the women. All 4 men wanted pepperoni pizza, while the 3 women want a combination. Pepperoni wins, right, 4 to 3? Not if you assume that the people who didn’t respond are the same as the ones who did. If you weight the responses to match the population (the full class of 9), a combination pizza is the winner.

Because you know the population of women is 5, you can weight the 3 responses from women by 5/3 = 1.6667. If we weight (or multiply) each vote we did receive from a woman by 1.6667, each vote for a combination now equals 1.6667, meaning that the 3 votes for combination total 5. Because we received a vote from every man in the class, we just weight their votes by 1. The big assumption we have to make is that the people we didn’t hear from (the 2 women that didn’t vote) are similar to the ones we did hear from. And if we don’t get any responses from a group we don’t have anything to infer their preferences or views from.
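
If you want to check the arithmetic yourself, here is the 9-person pizza vote as a short Python sketch. The numbers come straight from the example; the variable names are made up, and exact fractions are used so that 5/3 times 3 comes out to exactly 5 instead of a rounded decimal.

```python
from fractions import Fraction

class_size = {"men": 4, "women": 5}   # the known population
responses = {"men": 4, "women": 3}    # who actually answered
# In the example, every man chose pepperoni and every responding woman chose combination.
votes = {"men": "pepperoni", "women": "combination"}

# Weight = people in the group / responses received from the group.
weights = {group: Fraction(class_size[group], responses[group]) for group in class_size}

weighted_totals = {}
for group, choice in votes.items():
    weighted_totals[choice] = weighted_totals.get(choice, 0) + responses[group] * weights[group]

print(weights)
print(weighted_totals)
```

Each man's vote keeps a weight of 1 and each woman's vote gets a weight of 5/3, so the weighted totals are 4 for pepperoni and 5 for combination.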

Let’s go through a slightly more complex example, still just considering one quality about people in the class. Let’s say your class actually has 100 students, but you only received votes from 50. And, what type of pizza people voted for is mixed, but men still prefer pepperoni overall, and women still prefer combination. The class is 60% female and 40% male.

We received 21 votes from women out of the 60, so we can weight their responses by 60/21 to represent the population. We got 29 votes out of the 40 for men, so their responses can be weighted by 40/29. See the math below.

53.8 votes for combination? That might seem a little odd, but weighting isn’t a perfect science. We can’t identify what a non-respondent would have said exactly, all we can do is use the responses of other similar people to make a good guess. That issue often comes up in polling, where pollsters have to guess who is going to vote in a given election in order to project who will win. And we can weight on any characteristic of a person we think will be important, alone or in combination. Modern polls weight on age, gender, voting habits, education, and more to make the results as generalizable as possible.
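
Here is a rough Python sketch of how a total like 53.8 can come about. Big caveat: the text above gives only the number of responses per group (21 women, 29 men), not how each group split between the two pizzas, so the split below is an assumption chosen because it is consistent with women preferring combination, men preferring pepperoni, and a weighted total of about 53.8.

```python
class_size = {"women": 60, "men": 40}

# ASSUMED split of the 50 votes (not given in the text): it respects the
# group totals of 21 women and 29 men and the stated preferences.
votes = {
    "women": {"combination": 14, "pepperoni": 7},
    "men": {"combination": 10, "pepperoni": 19},
}

weighted = {"combination": 0.0, "pepperoni": 0.0}
for group, counts in votes.items():
    # Weight = group size in the class / responses from that group (60/21 or 40/29).
    weight = class_size[group] / sum(counts.values())
    for pizza, n in counts.items():
        weighted[pizza] += n * weight

print({pizza: round(total, 1) for pizza, total in weighted.items()})
```

Under that assumed split, the weighted votes add up to the full class of 100, with combination ahead even though pepperoni got more raw votes (26 to 24). That flip is exactly why weighting matters when one group responds less often than another.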

There’s an appendix later in this book where I walk through the actual steps of creating weights for a sample in R, if anyone actually does a survey. I intended this section to show that doing a good survey might be simpler than it seemed, but now it might sound even more difficult. A good lesson to take though is that there’s always another door to go through, another hurdle to improve your methods. Being good at research just means being constantly prepared to be given a new challenge, and being able to find another solution.

7.3 Non-Probability Sampling

Qualitative researchers’ main objective is to gain an in-depth understanding of the subject matter they are studying, rather than to generalize results to the population. As such, non-probability sampling is more common because of the researchers’ desire to gain information not from random elements of the population, but rather from specific individuals.

Random selection is not used in non-probability sampling. Instead, the personal judgment of the researcher determines who will be included in the sample. Typically, researchers base their selection on availability, quotas, or other criteria. As a result, not all members of the population have an equal chance of being included in the sample, and the researcher cannot know whether the sample represents the entire population. Consequently, researchers are not able to make valid generalizations about the population.

As with probability sampling, there are several types of non-probability samples. Convenience sampling, also known as accidental or opportunity sampling, is a process of choosing a sample that is easily accessible and readily available to the researcher. Researchers tend to collect samples from convenient locations such as their place of employment, their school, or another close affiliation. Although this technique allows for quick and easy access to available participants, a large part of the population is excluded from the sample.

For example, researchers (particularly in psychology) often rely on research subjects from their own universities. That is highly convenient: students are cheap to hire and readily available on campus. However, it means the results of the study may have limited ability to predict the motivations or behaviors of people who aren’t included in the sample, i.e., anyone who is not an 18-to-22-year-old attending college.

If I ask you to find out whether people approve of the mayor, and tell you I want 500 people’s opinions, should you go stand in front of the local grocery store? That would be convenient, and the people coming by will be random, right? Not really. If you stand outside a rural Piggly Wiggly or an urban Whole Foods, do you think you’ll see the same people? Probably not; people’s characteristics make them more or less likely to be in those locations. This technique runs a high risk of over- or under-representation and biased results, and it limits the ability to generalize to the larger population. As the name implies, though, it is convenient.

Purposive sampling, also known as judgmental or selective sampling, refers to a method in which the researcher decides who will be selected for the sample based on who or what is relevant to the study’s purpose. The researcher must first identify a specific characteristic of the population that can best help answer the research question. Then, they can deliberately select a sample that meets that particular criterion. Typically, the sample is small, with very specific experiences and perspectives. For instance, if I wanted to understand the experiences of prominent foreign-born politicians in the United States, I would purposefully build a sample of… prominent foreign-born politicians in the United States. That would exclude anyone who was born in the United States or who wasn’t a politician, and I’d have to define what I meant by prominent. Purposive sampling is susceptible to errors in judgment by the researcher and to selection bias due to a lack of random sampling, but it can be effective when researching small communities.

When dealing with small and difficult-to-reach communities, researchers sometimes use snowball samples, also known as chain referral sampling. Snowball sampling is a process in which the researcher selects an initial participant for the sample, then asks that participant to recruit or refer additional participants who have traits similar to their own. The cycle continues until the needed sample size is obtained.

This technique is used when the study calls for participants who are hard to find because of a unique or rare quality, or when participants do not want to be found because they are part of a stigmatized group or engage in a stigmatized behavior. Examples may include people with rare diseases, sex workers, or child sex offenders. It would be impossible to find an accurate list of sex workers anywhere, and surveying the general population about whether that is their job will produce false responses, as people will be unwilling to identify themselves. As such, a common method is to gain the trust of one individual within the community, who can then introduce you to others. It is important that the researcher builds rapport and gains trust so that participants can be comfortable contributing to the study, but that must also be balanced by maintaining objectivity in the research.

Snowball sampling is a useful method for locating hard-to-reach populations, but it cannot guarantee a representative sample because each contact is based on your last. For instance, let’s say you’re studying illegal fight clubs in your state. Some fight clubs allow weapons in the fights, while others completely ban them; those two types of clubs never interact because of their disagreement about whether weapons should be allowed, and there’s no overlap between them (no members in both types of club). If your initial contact is with a club that uses weapons, all of your subsequent contacts will be within that community, and so you’ll never understand the differences. If you didn’t know there were two types of clubs when you started, you’ll never even know you’re only researching half of the community. As such, snowball sampling can be a necessary technique when there are no other options, but it does have limitations.
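The fight club problem can be illustrated with a toy referral network. Everything below is hypothetical: the member names, the referral links, and the `snowball_sample` helper are assumptions made for this sketch. Members labeled `W` belong to clubs that allow weapons and members labeled `N` to clubs that ban them, and no referral crosses between the two.

```python
from collections import deque

# HYPOTHETICAL referral network: who can introduce the researcher to whom.
referrals = {
    "W1": ["W2", "W3"],
    "W2": ["W1", "W4"],
    "W3": ["W1"],
    "W4": ["W2"],
    "N1": ["N2"],
    "N2": ["N1", "N3"],
    "N3": ["N2"],
}

def snowball_sample(referrals, seed, target_size):
    """Follow referrals outward from one seed participant (breadth-first)."""
    sample = [seed]
    queue = deque([seed])
    while queue and len(sample) < target_size:
        person = queue.popleft()
        for contact in referrals.get(person, []):
            if contact not in sample and len(sample) < target_size:
                sample.append(contact)
                queue.append(contact)
    return sample

# We ask for 6 participants but can only ever reach the seed's own community.
sample = snowball_sample(referrals, seed="W1", target_size=6)
print(sample)
```

Starting from `W1`, the sample stalls at four people, all from weapon-allowing clubs, even though six were requested. Nothing in the sample itself reveals that the `N` clubs exist.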

Quota sampling is a process in which the researcher first divides a population into mutually exclusive subgroups, similar to stratified sampling. Depending on what is relevant to the study, subgroups can be based on a known characteristic such as age, race, or gender. The researcher then selects a sample from each subgroup to fit predefined quotas. Quota sampling is used for the same reason as stratified sampling: to ensure that your sample has representation of certain groups. For instance, let’s say that you’re studying sexual harassment in the workplace, and men are much more willing to discuss their experiences than women. You might decide that half of your final sample will be women, and stop requesting interviews with men once you fill your quota. The core difference is that while stratified sampling chooses randomly from within the different groups, quota sampling does not. A quota sample can be either proportional or non-proportional. Proportional quota sampling means ensuring that the quotas in the sample match the population (if 35% of the company is female, 35% of the sample should be female). Non-proportional quota sampling allows you to select your own quota sizes. If you think the experiences of women with sexual harassment are more important to your research, you can include whatever percentage of women you desire.
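A rough sketch of proportional quota sampling follows. The 35% figure comes from the example above; the sample size of 20, the order in which volunteers arrive, and the helper names are assumptions made for illustration. Note that people are accepted in the order they volunteer, not at random, which is what separates this from stratified sampling.

```python
def proportional_quotas(shares, sample_size):
    """Turn population shares into per-group quotas for a given sample size."""
    return {group: round(share * sample_size) for group, share in shares.items()}

def quota_sample(volunteers, quotas):
    """Accept volunteers in arrival order until every group's quota is full."""
    filled = {group: 0 for group in quotas}
    sample = []
    for name, group in volunteers:
        if group in quotas and filled[group] < quotas[group]:
            sample.append((name, group))
            filled[group] += 1
        if filled == quotas:
            break  # every quota is met, so stop recruiting
    return sample

# 35% of the company is female (from the text); sample size of 20 is assumed.
quotas = proportional_quotas({"female": 0.35, "male": 0.65}, sample_size=20)

# HYPOTHETICAL arrival order: three men volunteer for every woman.
volunteers = [
    (f"person_{i}", "female" if i % 4 == 0 else "male") for i in range(40)
]

sample = quota_sample(volunteers, quotas)
print(quotas)
print(len(sample), "participants selected")
```

Once the quota for men is full, later male volunteers are skipped while recruitment continues until enough women have been found. For a non-proportional design, the `quotas` dictionary could simply be written by hand, for example `{"female": 10, "male": 10}`.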

7.4 Dangers in sampling

Now that we’ve described all the different ways that one could create a sample, we can talk more about the pitfalls of sampling. Ensuring a quality sample means asking yourself some basic questions:

  • Who is in the sample?
  • How were they sampled?
  • Why were they sampled?

A meal is often only as good as the ingredients you use, and your data will only be as good as the sample. If you collect data from the wrong people, you’ll get the wrong answer. You’ll still get an answer; it’ll just be inaccurate. And I want to reemphasize that “wrong people” just means people who are inappropriate for your study. If I want to study bullying in middle schools, but I only talk to people who live in a retirement home, how accurate or relevant will the information I gather be? Sure, they might have grandchildren in middle school, and they may remember their own experiences. But wouldn’t my information be more relevant if I talked to students in middle school, or perhaps a mix of teachers, parents, and students? I’ll get an answer from retirees, but it won’t be the one I need. The sample has to be appropriate to the research question.

Is a bigger sample always better? Not necessarily. A larger sample can be useful, but a sample that is more representative of the population is better. That was made painfully clear when the magazine Literary Digest ran a poll to predict who would win the 1936 presidential election between Alf Landon and incumbent Franklin Roosevelt. Literary Digest had run the poll since 1916 and had correctly predicted the outcome every time. It was the largest poll ever, with responses from 2.27 million people. That was nearly 2 percent of the American population at the time, while many modern polls use only 1,000 responses for a much more populous country. What did they predict? They showed that Alf Landon would be the overwhelming winner, yet when the election was held Roosevelt won every state except Maine and Vermont. It was one of the most decisive victories in presidential history.

So what went wrong for the Literary Digest? Their poll was large (gigantic!), but it wasn’t representative of likely voters. They polled their own readership, which tended to be more educated and wealthy on average, along with people drawn from lists of registered automobile owners and telephone users (automobiles and telephones both tended to be owned by the wealthy at that time). Thus, the poll largely ignored the majority of Americans, who ended up voting for Roosevelt. The Literary Digest poll is famous for being wrong, but it led to significant improvements in the science of polling to avoid similar mistakes in the future. Researchers have learned a lot in the century since that mistake, even if polling and surveys still aren’t (and can’t be) perfect.

What kind of sampling strategy did Literary Digest use? Convenience: they relied on lists they had available rather than trying to ensure every American had a chance of being included. A representative poll of 2 million people will give you more accurate results than a representative poll of 2 thousand, but I’ll take a smaller representative poll over a larger one that uses convenience sampling any day.
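A small simulation can make the point concrete. The electorate below is entirely hypothetical: the share of wealthy voters and the support levels in each group are assumed numbers chosen to mimic the situation described, not historical data.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL electorate: 25% are wealthy (and so appear on the available
# lists); wealthy voters back the incumbent less often than everyone else.
population = []
for _ in range(200_000):
    wealthy = random.random() < 0.25
    supports_incumbent = random.random() < (0.40 if wealthy else 0.68)
    population.append((wealthy, supports_incumbent))

def support_share(people):
    """Fraction of the given people who support the incumbent."""
    return sum(supports for _, supports in people) / len(people)

true_share = support_share(population)

# Huge convenience sample: drawn only from people on the available lists.
listed = [person for person in population if person[0]]
big_convenience = random.sample(listed, 20_000)

# Small simple random sample: everyone has the same chance of selection.
small_random = random.sample(population, 1_000)

print(f"true support:              {true_share:.3f}")
print(f"convenience sample (20000): {support_share(big_convenience):.3f}")
print(f"random sample (1000):       {support_share(small_random):.3f}")
```

In this setup the convenience sample is twenty times larger, yet its estimate sits near the support level among the wealthy rather than the electorate as a whole, while the small random sample lands close to the true value. Adding more respondents from the same lists would not fix the problem, because the error comes from who is on the lists, not from how many of them answer.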

7.5 Summary

Picking the right type of sample is critical to getting an accurate answer to your research question. There are a lot of different options for selecting the people to participate in your research, but typically only one is both correct and possible for the research you’re doing. In the next chapter we’ll talk about a few other methods for conducting research, some of which don’t require any sampling by you.

Study Population

Hu, S. (2014). Study Population. In: Michalos, A.C. (eds) Encyclopedia of Quality of Life and Well-Being Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-0753-5_2893

Human Geography

A short definition for Population Geography

The geographical study of population, including its spatial distribution, dynamics, and movement. As a subdiscipline, it has taken at least three distinct but related forms, the most recent of which appears increasingly integrated with human geography in general. The earliest and most enduring form of population geography emerged from the 1950s onwards, as part of spatial science. Pioneered by Glenn Trewartha, Wilbur Zelinsky, William A. V. Clark, and others in the USA, as well as Jacqueline Beaujeu-Garnier and Pierre George in France, it focused on the systematic study of the distribution of population as a whole and the spatial variation in population characteristics such as fertility and mortality. Given the rapidly growing global population as well as the baby boom in affluent countries such as the USA, these geographers studied the relation between demographic growth and resources at an international scale, and population redistribution nationally (see demographic transition). An exemplary contribution might be Zelinsky’s mobility transition model (1971) linking migration and demographic change. They used secondary data sources such as censuses to map and describe population change and variation, including such trends as counter-urbanization. Such work could often be distinguished from population studies in general by its use of smaller scale data, below national level. Population projections at national and regional scales could be used to inform public policy debates on resource allocation. The increasing availability of more sophisticated spatial data, including more flexible census geographies, inter-censual surveys, and more detailed cross-tabulations such as the US Public-Use Microdata Samples, encouraged more advanced modelling, simulation, and projection techniques (see geodemographics).
This broad population geography has always been international and therefore comparative in scope, particularly under the auspices of the IGU Commission on Population Geography. To some extent, however, progress in the Global South has been held back by the poor availability of high-quality spatial data (Hugo 2006). Regular international conferences in population geography began in 2002.
A second variant of population geography is narrower in focus, akin to spatial demography. Geographers working in this field stressed the importance of keeping close to demography, its theories and methods, and therefore concentrating more on the core demographic variables of fertility, mortality, and, to a lesser extent, migration. They applied mathematical techniques to describe, infer, and also explain population patterns past and present. A volume edited by British geographers Bob Woods and Phil Rees (1986), Population Structures and Models: Developments in spatial demography, typifies this approach. Woods’ own specialism was the historical demography of infant mortality in Victorian Britain. Spatial demography has a strong historical component, not least among French and British geographers. By detailing the spatial (and temporal) variation in mortality, fertility, nuptiality, etc., geographers were able to disrupt many of the generalizations of population change and identify the significance of place.
Many population geographers from the 1980s onwards expressed anxiety that they were marginalized from mainstream human geography and its embrace of social theories from Marxism to feminism and postmodernism (Findlay and Graham 1991). Not enough research was being done on key issues such as famine, gender, and environment. They also sensed that other human geographers were overlooking the significance of population to wider processes. A ‘retheorization’ of population geography (White and Jackson) gradually took shape, involving more methodological diversity and theoretical plurality. New methods, such as lifecourse analysis, helped integrate biographical and individual-level studies into the field. In recent years there has been greater attention paid to gender, religion, age, disability, generation, sexuality, and race, variables which go beyond the vital statistics of births, deaths, and marriages. Furthermore, population geographers have begun to critique the standard census categories of the field, recognizing the social construction of childhood, whiteness, femininity, etc. Representative of this more theoretical approach is James Tyner’s (2009) War, Violence and Population: making the body count. Tyner argues that population geography should pay more attention to war and violence, using examples from the Vietnam War, Cambodia’s killing fields, and the Rwandan genocide. Grounded in post-colonialism and post-structuralism, he deploys Foucault’s concepts of biopower and disciplinary power to uncover the logics behind such violence.
This more recent form of population geography is increasingly aligned with human geography as a whole. One consequence has been the relative neglect of studies of fertility, mortality, and morbidity, the latter becoming the preserve of medical geography. Of the core demographic topics, migration continued to be the most central to population geographers; most of the papers in the main population geography journals, Population, Space and Place (launched in 1995 as The International Journal of Population Geography) and Espace, Populations, Sociétés (founded 1983), concern migration and related topics such as transnationalism.
All three forms of population geography outlined here continue side by side. Spatial and historical demography is making increasing use of data sources from outside Europe. Popular textbooks such as Population Geography: Problems, Concepts and Prospects (Peters and Larkin 2010) teach new generations the basics of the subject. By contrast, Adrian Bailey’s (2005) Making Population Geography presents a broader, more theoretically informed perspective. Recent conferences and journal special issues have focused on climate change, neo-Malthusianism, children’s geographies, vulnerability, and difference, although migration continues to predominate.

Rogers, A., Castree, N., & Kitchin, R. (2013). "Population geography." In A Dictionary of Human Geography. Oxford University Press. Retrieved 25 Jan. 2022.

The study of human populations; their composition, growth, distribution, and migratory movements with an emphasis on the last two. It is concerned with the study of demographic processes which affect the environment, but differs from demography in that it is concerned with the spatial expression of such processes. Population, Space and Place is the journal of the UK Population Geography Research Group.

Mayhew, S. (2015). "Population geography." In A Dictionary of Geography. Oxford University Press. Retrieved 25 Jan. 2022.

Demography: The observed, statistical, and mathematical study of human populations, concerned with the size, distribution, and composition of such populations.

Mayhew, S. (2015). "Demography." In A Dictionary of Geography. Oxford University Press. Retrieved 25 Jan. 2022.

In the Library's collections

Many of the books on Population Studies and Demography are located in the call number range HB 848 through HB 3697 on Berry Level 3.

To browse in the library's catalog, do a subject search for Population. That will give a list of the subject headings under Population and the number of items under each heading. You can also do the same for Demography.

  • population forecasting
  • overpopulation
  • malthusianism
  • sex preselection

Journal articles & titles

Articles and other writings about Population Studies can be found in many publications. Our collection includes several journals which look at Population Studies and Demography. Below is a short list of some of the journal titles we have in our Library's collection. Or you can use the search box at the top of the page.

Keeping up with the journal literature

You can get the app from the App Store or Google Play.

Don't own or use a mobile device? You can still use BrowZine! It's now available in a web version, which works the same way as the app version. Find the journals you like, create a custom Bookshelf, get ToCs and read the articles you want.

  • Last Updated: Feb 23, 2024 3:04 PM
  • URL: https://researchguides.dartmouth.edu/human_geography

Population Growth

Population growth is one of the most important topics we cover at Our World in Data.

For most of human history, the global population was a tiny fraction of what it is today. Over the last few centuries, the human population has gone through an extraordinary change. In 1800, there were one billion people. Today there are more than 8 billion of us.

But after a period of very fast population growth, demographers expect the world population to peak by the end of this century.

On this page, you will find all of our data, charts, and writing on changes in population growth. This includes how populations are distributed worldwide, how this has changed, and what demographers expect for the future.

Related topics

  • Child Mortality
  • Fertility Rate
  • Life Expectancy
  • Age Structure

Key insights on Population Growth

Population cartograms show us where the world’s people are.

Geographical maps show us where the world’s landmasses are; not where people are. That means they don’t always give us an accurate picture of how global living standards are changing.

One way to understand the distribution of people worldwide is to redraw the world map – not based on the area but according to population.

This is shown here as a population cartogram: a geographical presentation of the world where the size of countries is not drawn according to the distribution of land but by the distribution of people. It’s shown for the year 2018.

As the population size rather than the territory is shown in this map, you can see some significant differences when you compare it to the standard geographical map we’re most familiar with. 

Small countries with a high population density increase in size in this cartogram relative to the world maps we are used to – look at Bangladesh, Taiwan, or the Netherlands. Large countries with a small population shrink in size – look for Canada, Mongolia, Australia, or Russia.

What you should know about this data

  • This map is based on the United Nations’ 2017 World Population Prospects report. Our interactive charts show population data from the most recent UN revision. This means there may be minor differences between the figures shown on the map and the latest estimates in our other charts.

The world population has increased rapidly over the last few centuries

The speed of global population growth over the last few centuries has been staggering. For most of human history, the world population was well under one million.

As recently as 12,000 years ago, there were only 4 million people worldwide.

The chart shows the rapid increase in the global population since 1700. 

The one-billion mark wasn’t broken until the early 1800s. It was only a century ago that there were 2 billion people.

Since then, the global population has quadrupled to eight billion.

Around 108 billion people have ever lived on our planet. This means that today’s population of about 8 billion makes up roughly 7% of the total number of people ever born.

This increase has been the result of advances in living conditions and health that reduced death rates – especially in children – and increases in life expectancy.

  • This data comes from a combination of sources, all detailed in our sources article for our long-term population dataset.

Population growth is no longer exponential – it peaked decades ago

There’s a popular misconception that the global population is growing exponentially. But it’s not.

While the global population is still increasing in absolute numbers, population growth peaked decades ago.

In the chart, we see the global population growth rate per year. This is based on historical UN estimates and its medium projection to 2100.

Global population growth peaked in the 1960s at over 2% per year. Since then, rates have more than halved, falling to less than 1%. 

The UN expects rates to continue to fall until the end of the century. In fact, towards the end of the century, it projects negative growth, meaning the global population will shrink instead of grow.

Global population growth in absolute terms (the number of births minus the number of deaths) has also peaked.

The world has passed “peak child”

Hans Rosling famously coined the term “peak child” for the moment in global demographic history when the number of children stopped increasing.

According to the UN data, the world has passed “peak child”, where children are counted as those under the age of five.

The chart shows the UN’s historical estimates and projections of the number of children under five.

The UN estimates that the number of children under five in the world peaked in 2017. Demographers expect a decades-long plateau before the number declines more rapidly in the second half of the century.

  • These projections are sensitive to the assumptions made about future fertility rates worldwide. Find out more from the UN World Population Division.
  • Other sources and scenarios in the UN’s projections suggest that the peak was reached slightly earlier or later. However, most indicate that the world is close to “peak child” and the number of children will not increase in the coming decades.
  • The ‘ups and downs’ in this chart reflect generational effects and ‘baby booms’ when there are large cohorts of women of reproductive age, and high fertility rates. The timing of these transitions varies across the world.

The UN expects the global population to peak by the end of the century

When will population growth come to an end?

The UN’s historical estimates and latest projections for the global population are shown in the chart.

The UN projects that the global population will peak before the end of the century – in 2086, at just over 10.4 billion people.

  • These projections are sensitive to the assumptions made about future fertility and mortality rates worldwide. Find out more from the UN World Population Division.
  • Other sources and scenarios in the UN’s projections can produce a slightly earlier or later peak. Most demographers, however, expect that by the end of the century, the global population will have peaked or slowed so much that population growth will be small.

Research Population

All research questions address issues that are of great relevance to important groups of individuals known as a research population.

A research population is generally a large collection of individuals or objects that is the main focus of a scientific query. Research is done for the benefit of the population. However, because populations are often large, researchers usually cannot test every individual: doing so is too expensive and time-consuming. This is why researchers rely on sampling techniques.

A research population is also known as a well-defined collection of individuals or objects known to have similar characteristics. All individuals or objects within a certain population usually have a common, binding characteristic or trait.

Usually, the description of the population and the common binding characteristic of its members are the same. "Government officials" is a well-defined group of individuals which can be considered as a population and all the members of this population are indeed officials of the government.


Relationship of Sample and Population in Research

A sample is simply a subset of the population. The concept of a sample arises from the inability of researchers to test all the individuals in a given population. The sample must be representative of the population from which it was drawn, and it must be large enough to warrant statistical analysis.

The main function of the sample is to allow researchers to conduct the study on a manageable number of individuals from the population, so that the results can be used to derive conclusions that apply to the entire population. It is much like a give-and-take process. The population “gives” the sample, and then it “takes” conclusions from the results obtained from the sample.
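The give-and-take idea can be illustrated with a minimal Python sketch. The population of test scores, the seed, and the sample size of 200 are all made-up assumptions for illustration, not values from this article.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up test scores.
population = [rng.gauss(70, 10) for _ in range(10_000)]

# The population "gives" a sample of 200 individuals...
sample = rng.sample(population, 200)

# ...and "takes" back an estimate of its own mean.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# The standard error indicates how far the sample mean typically
# strays from the population mean for a sample of this size.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"standard error:  {standard_error:.2f}")
```

In real research the population mean is unknown; it is computed here only to show how close the sample estimate tends to be.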


Two Types of Population in Research

Target Population

Target population refers to the ENTIRE group of individuals or objects to which researchers are interested in generalizing the conclusions. The target population usually has varying characteristics and it is also known as the theoretical population.

Accessible Population

The accessible population is the population in research to which the researchers can apply their conclusions. This population is a subset of the target population and is also known as the study population. It is from the accessible population that researchers draw their samples.


Explorable.com (Nov 15, 2009). Research Population. Retrieved Apr 11, 2024 from Explorable.com: https://explorable.com/research-population

You Are Allowed To Copy The Text

The text in this article is licensed under the Creative Commons-License Attribution 4.0 International (CC BY 4.0) .

This means you're free to copy, share and adapt any parts (or all) of the text in the article, as long as you give appropriate credit and provide a link/reference to this page.

That is it. You don't need our permission to copy the article; just include a link/reference back to this page. You can use it freely (with some kind of link), and we're also okay with people reprinting in publications like books, blogs, newsletters, course-material, papers, wikipedia and presentations (with clear attribution).


Study Population: Characteristics & Sampling Techniques


How do you define a study population? Research studies require specific groups to draw conclusions and make decisions based on their results. This group of interest is known as the study population. The respondents actually selected from it form a sample, and the method used to select them is known as sampling.

What is a Study Population?

A study population is a group considered for a study or statistical reasoning. The study population is not limited to the human population only. It is a set of units that have something in common. They can be objects, animals, measurements, etc., with many characteristics within a group.

For example, suppose you are interested in the average time a person between the ages of 30 and 35 takes to recover from a particular condition after consuming a specific type of medication. In that case, the study population will be all people between the ages of 30 and 35.

As another example, suppose a medical study examines the spread of a specific disease in stray dogs in a city. Here, the stray dogs belonging to that city are the study population. A sample drawn from this population should represent the entire population you want to draw conclusions about.

How to establish a study population?

Sampling is a powerful technique for collecting opinions from a wide range of people, chosen from a particular group, to learn more about the whole group in general.

For any research study to be effective, it is necessary to select a study population that truly represents the entire population. Before starting your study, the target population must be identified and agreed upon. By defining and understanding your sample well in advance, you will largely eliminate feedback that is useless to the study.

If your survey aims to understand a product’s or service’s effectiveness, then the study population should be the customers who have used it, or those whose needs it best suits and who are likely to use it.

It would be costly and time-consuming to collect data from the entire population of your target market. By accurately sampling your study population, it is possible to build a true picture of the target market using the trends in the results.


Choosing an accurate sample from the study population

The decision on an appropriate sample depends on several key factors.

  • First, you decide which population parameters you want to estimate.
  • Don’t expect estimates from a sample to be exact. Always expect a margin of error when drawing conclusions from the results of a sample.
  • Understand the cost of sampling, because the budget limits how precise your estimates can be.
  • Know how variable the population you want to measure is. A large study population does not necessarily require a large sample.
  • Take into account the response rate of your population. A 20% response rate is often considered “good” for an online research study.
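Two of the factors above, margin of error and response rate, can be put into numbers. The sketch below uses the standard normal-approximation formula for the margin of error of a proportion. The function names and example figures are illustrative assumptions, with the 20% response rate taken from the list above.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 corresponds to about 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def invitations_needed(target_responses, response_rate):
    """People to invite so the expected number of responses meets the target."""
    return math.ceil(target_responses / response_rate)

# Hypothetical survey: 400 completed responses, worst-case proportion of 0.5.
moe = margin_of_error(p_hat=0.5, n=400)
invites = invitations_needed(target_responses=400, response_rate=0.20)

print(f"margin of error: +/- {moe:.3f}")
print(f"invitations to send: {invites}")
```

Using `p_hat=0.5` gives the widest margin for a given sample size, which is a common conservative choice when the true proportion is unknown.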

Sampling characteristics in the study population

  • Sampling is a mechanism to collect data without surveying the entire target population.
  • The study population is the entire unit of people you consider for your research. A sample is a subset of this group that represents the population.
  • Sampling reduces survey fatigue: because each person is asked to take part in fewer surveys, response rates tend to be higher.
  • It is also much cheaper and faster than measuring the entire group.
  • Tracking the response rate patterns of different groups will help determine how many respondents to select.
  • The findings are not limited to the sampled part; they are applied to the entire target population.

Sampling techniques for your study population

Now that you understand that you cannot survey the entire study population due to various factors, you should adopt one of the sample selection methodologies that best suits your research study.

In general terms, two methodologies can be applied: probability sampling and non-probability sampling .

Sampling Techniques: Probability Sampling

This method is used to select sample objects from a population based on probability theory. Every member of the population is included in the sampling frame and has a known, non-zero chance of being selected; in simple random sampling that chance is equal for everyone. Random selection minimizes the researcher’s influence on who is chosen, which reduces selection bias.

Probability sampling can be categorized into four types:

  • Simple Random Sampling: Simple random sampling is the easiest way to select a sample. The objects in this sample are chosen at random, and each member has exactly the same probability of being selected.
  • Cluster Sampling: Cluster sampling is a method in which respondents are grouped into clusters, and whole clusters are then selected at random. These groups can be defined based on age, gender, location, and demographic parameters.
  • Systematic Sampling: In systematic sampling, individuals are chosen at equal intervals from the population. A starting point is selected, and then respondents are chosen at predefined sample intervals.
  • Stratified Sampling: Stratified random sampling is a process of dividing respondents into distinct, predefined groups (strata) and sampling from each. In this method, the strata do not overlap but collectively represent the entire population.

Sampling techniques: Non-probabilistic sampling

The non-probability sampling method relies on the researcher’s judgment or convenience rather than random selection, which exposes it to sample selection bias. This sampling method derives primarily from the researcher’s ability to access the sample. Here the population members do not all have the same opportunity to be part of the sample.

Non-probability sampling can be further classified into four distinct types:

  • Convenience Sampling: As the name implies, convenience sampling reflects the ease with which the researcher can reach the respondent. The researchers do not control who ends up in the sample; respondents are included solely for reasons of proximity, not representativeness.
  • Deliberate, critical, or judgmental sampling: In this type of sampling, the researcher uses their own judgment to build the sample, based on the nature of the study and their understanding of the target audience. Only people who meet the research criteria and the final objective are selected.
  • Snowball Sampling: As a snowball rolls, it accumulates more snow around itself. Similarly, with snowball sampling, respondents are tasked with providing references or recruiting further respondents for the study once their participation ends.
  • Quota Sampling: Quota sampling is a method where the researcher selects a sample based on its strata, filling a set quota for each one. In this method, the strata are mutually exclusive: no person can belong to two of them at once.


Advantages and disadvantages of sampling in a study population

In most cases, insights cannot be gathered from the total study population and are obtained only from predefined samples. This comes with its own advantages and disadvantages. Some of them are listed below.

Advantages

  • Highly accurate, with a low probability of sampling errors (if sampled well)
  • Economically feasible by nature and highly reliable
  • A good fit for many different surveys
  • Takes less time than surveying the entire population
  • Reduced resource deployment
  • Data-intensive and comprehensive
  • Findings can be applied to the wider population
  • Ideal when the study population is vast

Disadvantages

  • Insufficient samples
  • Possibility of bias
  • Precision problems (if sampling is poor)
  • Difficulty obtaining the typical sample
  • Lack of quality sources
  • Possibility of making mistakes.


Sampling Methods | Types, Techniques & Examples

Published on September 19, 2019 by Shona McCombes . Revised on June 22, 2023.

When you conduct research about a group of people, it’s rarely possible to collect data from every person in that group. Instead, you select a sample. The sample is the group of individuals who will actually participate in the research.

To draw valid conclusions from your results, you have to carefully decide how you will select a sample that is representative of the group as a whole. This is called a sampling method. There are two primary types of sampling methods that you can use in your research:

  • Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group.
  • Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.

You should clearly explain how you selected your sample in the methodology section of your paper or thesis, as well as how you approached minimizing research bias in your work.

Population vs. sample

First, you need to understand the difference between a population and a sample, and identify the target population of your research.

  • The population is the entire group that you want to draw conclusions about.
  • The sample is the specific group of individuals that you will collect data from.

The population can be defined in terms of geographical location, age, income, or many other characteristics.


It is important to carefully define your target population according to the purpose and practicalities of your project.

If the population is very large, demographically mixed, and geographically dispersed, it might be difficult to gain access to a representative sample. A lack of a representative sample affects the validity of your results, and can lead to several research biases, particularly sampling bias.

Sampling frame

The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).

Sample size

The number of individuals you should include in your sample depends on various factors, including the size and variability of the population and your research design. There are different sample size calculators and formulas depending on what you want to achieve with statistical analysis.
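As one example of such a formula, the sketch below uses Cochran's formula for estimating a proportion, with an optional finite population correction. This is one common choice among several, not the method of any particular calculator; the function name and default values are assumptions for illustration.

```python
import math

def sample_size(margin, population_size=None, p=0.5, z=1.96):
    """Sample size needed to estimate a proportion within +/- margin.

    p is the expected proportion (0.5 is the most conservative guess),
    z=1.96 corresponds to about 95% confidence, and population_size,
    if given, applies the finite population correction.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

print(sample_size(0.05))                             # very large population
print(sample_size(0.05, population_size=2_000))      # small population
print(sample_size(0.05, population_size=1_000_000))  # large population
```

The required sample grows very little once the population is large, which is why a large population does not by itself call for a large sample.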

Probability sampling methods

Probability sampling means that every member of the population has a chance of being selected. It is mainly used in quantitative research. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice.

There are four main types of probability sample.


1. Simple random sampling

In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population.

To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.
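A minimal sketch of simple random sampling with Python's standard `random` module follows. The sampling frame of 1,000 employee IDs and the seed are hypothetical.

```python
import random

# Hypothetical sampling frame: one ID per member of the population.
sampling_frame = [f"employee_{i:04d}" for i in range(1, 1001)]

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# random.sample draws without replacement, so every member has the
# same chance of selection and nobody is picked twice.
sample = rng.sample(sampling_frame, k=100)

print(sample[:5])
```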

2. Systematic sampling

Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.

If you use this technique, it is important to make sure that there is no hidden pattern in the list that might skew the sample. For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.
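Systematic sampling can be sketched as follows, assuming a numbered list and a random starting point within the first interval. The helper name `systematic_sample` and the frame of 1,000 numbers are illustrative.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick n units at a regular interval, starting from a random offset."""
    interval = len(frame) // n       # e.g. every 10th person
    start = rng.randrange(interval)  # random start within the first interval
    return [frame[start + i * interval] for i in range(n)]

# Hypothetical frame: people numbered 1 to 1000.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=100, rng=random.Random(11))

print(sample[:5])
```

If the list has a hidden periodic pattern, as in the HR database example above, shuffling the frame first or switching to simple random sampling avoids the skew.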

3. Stratified sampling

Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.

To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender identity, age range, income bracket, job role).

Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
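The three steps above (divide into strata, allocate proportionally, sample within each stratum) can be sketched like this. The frame of 800 staff and 200 managers and the function name `stratified_sample` are assumptions made up for the example.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, total_n, rng):
    """Allocate total_n across strata proportionally, then sample within each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can make the total
        # differ slightly from total_n when shares are not whole numbers.
        k = round(total_n * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: (id, job_role) pairs, 800 staff and 200 managers.
employees = [(i, "staff") for i in range(800)]
employees += [(i, "manager") for i in range(800, 1000)]

sample = stratified_sample(
    employees, stratum_of=lambda e: e[1], total_n=100, rng=random.Random(7)
)
print(sum(1 for e in sample if e[1] == "manager"), "managers in the sample")
```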

4. Cluster sampling

Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.

If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling.

This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the sampled clusters are really representative of the whole population.
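The difference between single-stage cluster sampling and multistage sampling can be sketched as below. The ten offices of 50 employees each are hypothetical clusters invented for the example.

```python
import random

rng = random.Random(3)

# Hypothetical clusters: 10 offices, each with 50 employees.
offices = {
    f"office_{o}": [f"office_{o}_emp_{e}" for e in range(50)]
    for o in range(10)
}

# Stage 1: randomly select whole clusters.
chosen_offices = rng.sample(sorted(offices), k=3)

# Single-stage cluster sample: everyone in the chosen clusters.
one_stage = [emp for office in chosen_offices for emp in offices[office]]

# Multistage sample: a random subset within each chosen cluster.
two_stage = [
    emp
    for office in chosen_offices
    for emp in rng.sample(offices[office], k=10)
]

print(len(one_stage), len(two_stage))
```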

Non-probability sampling methods

In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included.

This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible.

Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.


1. Convenience sampling

A convenience sample simply includes the individuals who happen to be most accessible to the researcher.

This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can’t produce generalizable results. Convenience samples are at risk for both sampling bias and selection bias.

2. Voluntary response sampling

Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g. by responding to a public online survey).

Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others, leading to self-selection bias.

3. Purposive sampling

This type of sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research.

It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your arguments.

4. Snowball sampling

If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to “snowballs” as you get in contact with more people. The downside here is also representativeness, as you have no way of knowing how representative your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.
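Snowball recruitment can be modeled, as a rough sketch, as a breadth-first walk over a referral network. The names and the `referrals` mapping below are invented for illustration; real recruitment is rarely this orderly.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit participants by following referrals outward from the seeds."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        # Each participant refers contacts who have not been approached yet.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral network: who can introduce whom.
referrals = {
    "ana": ["ben", "cal"],
    "ben": ["dee"],
    "cal": ["dee", "eli"],
    "dee": [],
    "eli": ["fay"],
    "gus": ["hal"],  # not connected to the seed, so never reached
}

sample = snowball_sample(referrals, seeds=["ana"], max_size=5)
print(sample)
```

People outside the seed's network, such as "gus" here, can never enter the sample, which is one way the representativeness problem described above arises.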

5. Quota sampling

Quota sampling relies on the non-random selection of a predetermined number or proportion of units. This is called a quota.

You first divide the population into mutually exclusive subgroups (called strata) and then recruit sample units until you reach your quota. These units share specific characteristics, determined by you prior to forming your strata. The aim of quota sampling is to control what or who makes up your sample.
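A minimal sketch of quota sampling: people are taken in the order they happen to be encountered (a non-random order) until each stratum's quota is full. The age groups, quotas, and helper name `quota_sample` are hypothetical.

```python
def quota_sample(arrivals, stratum_of, quotas):
    """Accept people in arrival order until every stratum's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        stratum = stratum_of(person)
        if remaining.get(stratum, 0) > 0:
            sample.append(person)
            remaining[stratum] -= 1
        if not any(remaining.values()):
            break  # all quotas are full
    return sample

# Hypothetical respondents as (name, age_group), in the order encountered.
arrivals = [
    ("p1", "under_30"), ("p2", "under_30"), ("p3", "30_plus"),
    ("p4", "under_30"), ("p5", "30_plus"), ("p6", "30_plus"),
]

sample = quota_sample(
    arrivals, stratum_of=lambda p: p[1], quotas={"under_30": 2, "30_plus": 2}
)
print(sample)
```

Unlike stratified sampling, nothing here is random: whoever turns up first fills the quota, which is why quota sampling counts as a non-probability method.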

Frequently asked questions about sampling

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.

In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.

In multistage sampling, or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.


McCombes, S. (2023, June 22). Sampling Methods | Types, Techniques & Examples. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/methodology/sampling-methods/

  • v.90(4); 2012 Dec

Who and What Is a “Population”? Historical Debates, Current Controversies, and Implications for Understanding “Population Health” and Rectifying Health Inequities

The idea of “population” is core to the population sciences but is rarely defined except in statistical terms. Yet who and what defines and makes a population has everything to do with whether population means are meaningful or meaningless, with profound implications for work on population health and health inequities.

In this article, I review the current conventional definitions of, and historical debates over, the meaning(s) of “population,” trace back the contemporary emphasis on populations as statistical rather than substantive entities to Adolphe Quetelet's powerful astronomical metaphor, conceived in the 1830s, of l’homme moyen (the average man), and argue for an alternative definition of populations as relational beings. As informed by the ecosocial theory of disease distribution, I then analyze several case examples to explore the utility of critical population-informed thinking for research, knowledge, and policy involving population health and health inequities.

Four propositions emerge: (1) the meaningfulness of means depends on how meaningfully the populations are defined in relation to the inherent intrinsic and extrinsic dynamic generative relationships by which they are constituted; (2) structured chance drives population distributions of health and entails conceptualizing health and disease, including biomarkers, as embodied phenotype and health inequities as historically contingent; (3) persons included in population health research are study participants, and the casual equation of this term with “study population” should be avoided; and (4) the conventional cleavage of “internal validity” and “generalizability” is misleading, since a meaningful choice of study participants must be in relation to the range of exposures experienced (or not) in the real-world societies, that is, meaningful populations, of which they are a part.

Conclusions

To improve conceptual clarity, causal inference, and action to promote health equity, population sciences need to expand and deepen their theorizing about who and what makes populations and their means.

Population sciences, whether focused on people or the plenitude of other species with which we inhabit this world, rely on a remarkable, almost alchemical, feat that nevertheless now passes as commonplace: creating causal and actionable knowledge via the transmutation of data from unique individuals into population distributions, dynamics, and rates. In the case of public health, a comparison of population data, especially rates and averages of traits, sets the basis for not only elucidating etiology but also identifying and addressing health, health care, and health policy inequities manifested in differential outcomes caused by social injustice (Davis and Rowland 1983; Irwin et al. 2006; Krieger 2001, 2011; Svensson 1990; Whitehead 1992; WHO 2008, 2011).

But who are these “populations,” and why should their means be meaningful? Might some instead be meaningless, the equivalent of fool's gold or, worse, dangerously misleading?

Because “population” is such a fundamental term for so many sciences that analyze population data, for example, epidemiology, demography, sociology, ecology, and population biology and population genetics, not to mention statistics and biostatistics (see, e.g., Desrosières 1998; Gaziano 2010; Greenhalgh 1996; Hey 2011; Kunitz 2007; Mayr 1988; Pearce 1999; Porter 1986; Ramsden 2002; Stigler 1986; Weiss and Long 2009), presumably it would be reasonable to posit that the meaning of “population” is clear-cut and needs no further discussion.

As I document in this article, the surprise instead is that although the idea of “population” is core to the population sciences, it is rarely defined, especially in sciences dealing with people, except in abstract statistical terms. Granted, the “fuzziness” of concepts sometimes can be useful, especially when their empirical content is still being worked out, as illustrated by the well-documented contested history of the meanings of the “gene” as variously an abstract, functional, or physical entity, extending from before and still continuing well after the mid-twentieth-century discovery of DNA (Burian and Zallen 2009; Falk 2000; Keller 2000; Morange 2001). Nevertheless, such fuzziness can also be a major problem, especially if the lack of clear definition or a conflation of meanings distorts causal analysis and accountability.

In this article, I accordingly call for expanding and deepening what I term “critical population-informed thinking.” Such thinking is needed to reckon with, among other things, claims of “population-based” evidence, principles for comparing results across “populations” (and their “subpopulations”), terminology regarding “study participants” (vs. “study population”), and assessing the validity (and not just the generalizability) of results. Addressing these issues requires clearly differentiating between (1) the dominant view that populations are (statistical) entities composed of component parts defined by innate attributes and (2) the alternative that I describe, in which populations are dynamic beings constituted by intrinsic relationships both among their members and with the other populations that together produce their existence and make meaningful causal inference possible.

To make my case, I review current conventional definitions of, and historical debates over, the meaning(s) of “population” and then offer case examples involving population health and health inequities. Informing my argument is the ecosocial theory of disease distribution and its focus on how people literally biologically embody their societal and ecological context, at multiple levels, across the life course and historical generations (Krieger 1994, 2001, 2011), thereby producing population patterns of health, disease, and well-being.

Who and What Is a Population?

Conventional definitions.

Who and what determines who and what counts as a “population”? Table 1 lists conventional definitions culled from several contemporary scholarly reference texts. As quickly becomes apparent, the meaning of this term has expanded over time to embrace a variety of concepts. Tracing its etymology to the word's Latin roots, the Oxford English Dictionary (OED 2010), for example, notes that “population” originally referred to the people living in (i.e., populating) a particular place, and this remains its primary meaning. Even so, as the OED's definitions also make clear, “population” has come to acquire a technical meaning. In statistics, it refers to “a (real or hypothetical) totality of objects or individuals under consideration, of which the statistical attributes may be estimated by the study of a sample or samples drawn from it.” In genetics (or, really, biology more broadly), the OED defines “population” as “a group of animals, plants, or humans, within which breeding occurs.” Likewise, atoms, subatomic particles, stars, and other “celestial objects” are stated as sharing certain properties allowing them to be classed together in “populations” (even though the study of inanimate objects typically falls outside the purview of the “population sciences”).

Definitions of “Population” from Scholarly Reference Texts

Mirroring the OED 's definitions are those provided in diverse “population sciences” dictionaries and encyclopedias. Four such texts, whose definitions are echoed in key works in population health ( Evans, Barer, and Marmor 1994 ; Rose 1992 , 2008 ; Rothman, Greenland, and Lash 2008 ; Young 2005 ), are worth noting: A Dictionary of Epidemiology ( Porta 2008 ), A Dictionary of Sociology ( Scott and Marshall 2005 ), and the two entries from the International Encyclopedia of the Social & Behavioral Sciences that offer a definition of “population,” one focused on “human evolutionary genetics” ( Mountain 2001 ) and the other on “generalization: conceptions in the social sciences” ( Cook 2001 ). A fifth resource, the Encyclopedia of Life Sciences , interestingly does not include any articles specifically on defining “population.” However, of the 396 entries located with the search term “population” and sorted by “relevance,” the first 25 focus on populations principally in relation to genetics, reproduction, and natural selection ( Clarke et al. 2000 –2011).

Among these four texts, all germane to population sciences that study people, the first two briefly define “population” in relation to inhabitants of an area but notably remain mum on the myriad populations appearing in the public health literature not linked to geographic locale (e.g., the “elderly population,” the “white population,” or the “lesbian/gay/bisexual/transgender population”). Most of their text is instead devoted to the idea of “population” in relation to statistical sampling ( Porta 2008 ; Scott and Marshall 2005 ). By contrast, the third text invokes biology (with no mention of statistics) and defines a “population” to be a “mating pool” ( Mountain 2001 , 6985), albeit observing that “groups of humans rarely, if ever, meet this definition,” so that “in practice … human evolutionary geneticists delineate populations along linguistic, geographic, socio-political, and/or cultural boundaries. A population might include, for example, all speakers of a particular Bantu language, all inhabitants of a river valley in Italy, or all members of a caste group in India.”

The fourth text avers that in the social sciences, “population” has two meanings: as a theory-dependent hypothetical “construct” (whose basis is not defined) and as an empirically defined “universe” (used as a sampling frame) ( Cook 2001 ). A telling example illustrates that for people, geographical location, nationality, and ancestry need not neatly match, as in the case of an illegal immigrant or a legal citizen of one country legally residing in a different country ( table 1 ). Consequently, apart from specifying that entities comprising a population individually possess some attribute qualifying them to be a member of that population, none of the conventional definitions offers systematic criteria by which to decide, in theoretical or practical terms, who and what is a population, let alone whether and, if so, why their mean value or rate (or any statistical parameter) might have any substantive meaning.

Meet the “Average Man”: Quetelet's 1830s Astronomical Metaphor Amalgamating “Population” and “Statistics”

The overarching emphasis on “populations” as technical statistical entities and the limited discussion as to what defines them, especially for the human populations, is at once remarkable and unsurprising. It is remarkable because “population” stands at the core, conceptually and empirically, of any and all population sciences. It is unsurprising, given the history and politics of how, in the case of people, “population” and “sample” first were joined ( Krieger 2011 ).

In brief, and as recounted by numerous historians of statistics ( Daston 1987 ; Desrosières 1998 ; Hacking 1975 , 1990 ; Porter 1981 , 1986 , 1995 , 2002 , 2003 ; Stigler 1986 , 2002 ; Yeo 2003 ), during the early 1800s the application of quantitative methods and laws of probability to the study of people in Europe took off, a feat that required reckoning with such profound issues as free will, God's will, and human fate. To express the mind shift involved, a particularly powerful metaphor took root: that of “l’homme moyen” (the average man), which, in the convention of the day, included women ( figure 1 ). First used in 1831 in an address given by Adolphe Quetelet (1796–1874), the Belgian astronomer-turned-statistician-turned-sociologist-turned-nosologist ( Hankins 1968 ; Stigler 2002 ), the metaphor gained prominence following the publication in 1835 of Quetelet's enormously influential opus, Sur l’homme et le développement de ses facultés, ou essai de physique sociale ( Quetelet 1835 ). Melding the ideas of essential types, external influences, and random errors, the image of the “average man” solidified a view of populations, particularly human populations, as innately defined by their intrinsic qualities. Revealing these innate qualities, according to Quetelet, were a population's on-average traits, whether pertaining to height and weight, birth and death rates, intellectual faculties, moral properties, or even propensity to commit crime ( Quetelet 1835 , 1844 ).

What is the meaning of means and errors? Adolphe Quetelet (1796–1874) and the astronomical metaphor animating his 1830s “l’homme moyen” (“the average man”).

Source: Illustration of normal curve from Quetelet 1844 .

The metaphor animating Quetelet's “average man” was inspired by his background in astronomy and meteorology. Shifting his gaze from the heavens to the earth, Quetelet arrived at his idea of “the average man” by inverting the standard approach his colleagues used to fix the location of stars, in which the results of observations from multiple observatories (each with some degree of error) were combined to determine a star's most likely celestial coordinates ( Porter 1981 ; Stigler 1986 , 2002 ). Reasoning by analogy, Quetelet ingeniously, if erroneously, argued that the distribution of a population's characteristics served as a guide to its true (inherent) value ( Quetelet 1835 , 1844 ). From this standpoint, the observed “deviations” or “errors” arose from the imperfect variations of individuals, each counting as an “observation-with-error” akin to the data produced by each observatory. The impact of these “errors” was effectively washed out by the law of large numbers. Attesting to the power of metaphor in science and more generally ( Krieger 1994 , 2011 ; Martin and Harré 1982 ; Ziman 2000 ), Quetelet's astronomical “average man” simultaneously enabled a new way to see and study population variation even as it erased a crucial distinction. For a star, the location of the mean referred to the location of a singular real object, whereas for a population, the location of its population mean depended on how the population was defined.

To Quetelet, this new conception of population meant that population means, based on sufficiently large samples, could be meaningfully compared to determine if the populations’ essential characteristics truly differed. The contingent causal inference was that if the specified populations differed in their means, this would mean that they either differed in their essence (if subject to the same external forces) or else were subject to different external forces (assuming the same internal essence). Reflecting, however, the growing pressure for nascent social scientists to be seen as “objective,” Quetelet's discussion of external forces steered clear of politics. Concretely, this translated to not challenging mainstream religious or economic beliefs, including the increasingly widespread individualistic philosophies then linked to the rapid ascendance of the liberal free-market economy ( Desrosières 1998 ; Hacking 1990 ; Heilbron, Magnusson, and Wittrock 1998 ; Porter 1981 , 1986 , 1995 , 2003 ; Ross 2003 ). For example, although Quetelet conceded that “the laws and principles of religion and morality” could act as “influencing causes” ( Quetelet 1844 , xvii), in his analyses he treated education, occupation, and the propensity to commit crime as individual attributes no different from height and weight. The net result was that a population's essence—crucial to its success or failure—was conceptualized as an intrinsic property of the individuals who comprised the population; the corollary was that population means and rates were a result and an expression of innate individual characteristics.

Or so the argument went. At the time, others were not convinced and contended that Quetelet's means were simply arbitrary arithmetic contrivances resulting from declaring certain groups to be populations ( Cole 2000 ; Desrosières 1998 ; Porter 1981 ; Stigler 1986 , 2002 ). As Quetelet himself acknowledged, the national averages and rates defining a country's “average man” coexisted with substantial regional and local variation. Hence, data for one region of France would yield one mean, and for another region it would be something else. If the two were combined, a third mean would result—and who was to say which, if any, of these means was meaningful, let alone reflective of an intrinsic essence (or, for that matter, external influences)?
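The arithmetic behind this objection can be made concrete with a minimal sketch. The region sizes and mean heights below are invented for illustration and are not Quetelet's data; `pooled_mean` is a hypothetical helper name.

```python
# Hypothetical illustration: the figures below are invented, not Quetelet's data.
def pooled_mean(groups):
    """Return the mean of all groups combined, given (size, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

region_a = (2000, 165.0)  # (number of people, mean height in cm), assumed values
region_b = (6000, 169.0)

combined = pooled_mean([region_a, region_b])
print(combined)  # a third mean, lying between the two and pulled toward the larger region
```

Which of the three values counts as "the" mean depends entirely on where the boundary of the population is drawn, which is the critics' point.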

Quetelet's tautological answer was to differentiate between what he termed “true means” versus mere “arithmetical averages” ( Porter 1981 ; Quetelet 1844 ). The former could be derived only from “true” populations, whose distribution by definition expressed the “law of errors” (e.g., the normal curve). In such cases, Quetelet argued, the mean reflected the population's true essence. By contrast, any disparate lot of objects measured by a common metric could yield a simple “average” (e.g., average height of books or of buildings), but the meaningless nature of this parameter, that is, its inability to be informative about any innate “essence,” would be revealed by the lack of a normal distribution.

And so the argument continued until the terms were changed in a radically different way by Darwin's theory of evolution, presented in Origin of Species , published in 1859 (Darwin [1859] [ 2004 ]). The central conceptual shift was from “errors” to “variation” ( Eldredge 2005 ; Hey 2011 ; Hodge 2009 ; Mayr 1988 ). This variation, thought to reflect inheritable characteristics passed on from parent to progeny, was in effect a consequence of who survived to reproduce, courtesy of “natural selection.” No longer were species, that is, the evolving biological populations to which these individuals belonged, either arbitrary or constant. Instead, they were produced by reproducing organisms and their broader ecosystem. Far from being either Platonic “ideal types” ( Hey 2011 ; Hodge 2009 ; Mayr 1988 ; Weiss and Long 2009 ), per Quetelet's notion of fixed essence plus error, or artificially assembled aggregates capable of yielding only what Quetelet would term meaningless mere “averages,” “populations” were newly morphed into temporally dynamic and mutable entities arising by biological descent. From this standpoint, variation was vital, and variants that were rare at one point in time could become the new norm at another.

Nevertheless, even though the essence of biological populations was now impermanent, what substantively defined “populations” remained framed as fundamentally endogenous. In the case of biological organisms, this essence resided in whatever material substances were transmitted by biological reproduction. Left intact was an understanding of population, population traits, and their variability as innately defined, with this variation rendered visible through a statistical analysis of appropriate population samples. The enduring result was to (1) collapse the distinctions between populations as substantive beings versus statistical objects and (2) imply that population characteristics reflect and are determined by the intrinsic essence of their component parts. Current conventional definitions of “population” say as much and no more ( table 1 ).

Conceptual Criteria for Defining Meaningful Populations for Public Health

Framing and Contesting “Population” through an Epidemiologic Lens . In the 150 years since these initial features of populations were propounded, they have become deeply entrenched, although not entirely uncontested. Figure 2 is a schematic encapsulation of mid-nineteenth to early twentieth-century notions of populations, with the entries emphasizing population statistics and population genetics because of their enduring influence, even now, on conceptions of populations in epidemiology and other population sciences. During this period, myriad disciplines in the life, social, and physical sciences embraced a statistical understanding of “population” ( Desrosières 1998 ; Hey 2011 ; Porter 1981 , 1986 , 2002 , 2003 ; Ross 2003 ; Schank and Twardy 2009 ; Yeo 2003 ). Eugenic thinking likewise became ascendant, espoused by leading scientists and statisticians, especially the newly named “biometricians,” who held that individuals and populations were determined and defined by their heredity, with the role of the “environment” being negligible or nil ( Carlson 2001 ; Davenport 1911 ; Galton 1904 ; Kevels 1985 ; Mackenzie 1982 ; Porter 2003 ; Tabery 2008 ).

A schematic cross-disciplinary genealogy of mid-nineteenth to early twentieth-century “population” thinking and current impact.

Sources : Carver 2003 ; Crow 1990 , 1994 ; Dale and Katz 2011 ; Darwin 1859 ; Daston 1987 ; Desrosières 1998 ; Eldredge 2005 ; Galton 1889 , 1904 ; Hacking 1975 , 1990 ; Hey 2011 ; Hodge 2009 ; Hogben 1933 ; Keller 2010 ; Mackenzie 1982 ; Marx 1845 ; Mayr 1988 ; Porter 1981 , 1986 , 2002 , 2003 ; Quetelet 1835 , 1844 ; Sarkar 1996 ; Schank and Twardy 2009 ; Stigler 1986 , 1997 ; Tabery 2008 ; Yeo 2003 .

It was also during the early twentieth century that the nascent academic discipline of epidemiology advanced its claims about being a population science, as part of distinguishing both the knowledge it generated and its methods from those used in the clinical and basic sciences ( Krieger 2000 , 2011 ; Lilienfeld 1980 ; Rosen [1958] [ 1993 ]; Susser and Stein 2009 ; Winslow et al. 1952 ). In 1927 and in 1935, for example, the first professors of epidemiology in the United States and the United Kingdom—Wade Hampton Frost (1880–1938) at the Johns Hopkins School of Hygiene and Public Health in 1921 ( Daniel 2004 ; Fee 1987 ), and Major Greenwood (1880–1949) at the London School of Hygiene and Tropical Medicine in 1928 ( Butler 1949 ; Hogben 1950 )—urged that epidemiology clearly define itself as the science of the “mass phenomena” of disease, Frost in his landmark essay “Epidemiology” (Frost [1927] [ 1941 ], 439) and Greenwood in his discipline-defining book Epidemics and Crowd Diseases: An Introduction to the Study of Epidemiology ( Greenwood 1935 , 125). Neither Frost nor Greenwood, however, articulated what constituted a “population,” other than the large numbers required to make a “mass.”

Also during the 1920s and 1930s, two small strands of epidemiologic work—each addressing different aspects of the inherent dual engagement of epidemiology with biological and societal phenomena ( Krieger 1994 , 2001 , 2011 )—began to challenge empirically and conceptually the dominant view of population characteristics as arising solely from individuals' intrinsic properties. The first thread was metaphorically inspired by chemistry's law of “mass action,” referring to the likelihood that two chemicals meeting and interacting in, say, a beaker, would equal the product of their spatial densities ( Heesterbeek 2005 ; Mendelsohn 1998 ). Applied to epidemiology, the law of “mass action” spurred novel efforts to model infectious disease dynamics arising from interactions between what were termed the “host” and the “microbial” populations, taking into account changes in the host's characteristics (e.g., from susceptible to either immune or dead) and also the population size, density, and migration patterns (Frost [ 1928 ] 1976; Heesterbeek 2005 ; Hogben 1950 ; Kermack and McKendrick 1927 ; Mendelsohn 1998 ).
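As a rough illustration of how the "mass action" idea carries over to host populations, the sketch below steps a simple susceptible/infected/removed model forward in time. It is a simplified teaching sketch, not a reconstruction of the models in the works cited; the function name, the parameters `beta` and `gamma`, the time step, and all starting values are assumptions chosen for illustration.

```python
def simulate_mass_action(s, i, r, beta, gamma, steps, dt=0.1):
    """Take small time steps of a simple susceptible/infected/removed model.

    s, i, r are fractions of the host population; beta and gamma are
    assumed contact and removal rates (hypothetical values, see below).
    """
    history = [(s, i, r)]
    for _ in range(steps):
        # Mass action: new infections are proportional to the product
        # of the susceptible and infected densities.
        new_infections = beta * s * i * dt
        # Infected hosts leave the infected group (immune or dead).
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history

history = simulate_mass_action(s=0.99, i=0.01, r=0.0, beta=0.5, gamma=0.1, steps=1000)
peak_infected = max(i for _, i, _ in history)
print(round(peak_infected, 3))
```

The course of the outbreak here depends on the changing relationship between the groups (the product `s * i`), not on a fixed trait of any single host.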

The second thread was articulated in debates concerning eugenics and also in response to the social crises and economic depression precipitated by the 1929 stock market crash. Its focus concerned how societal conditions could drive disease rates, not only by changing individuals’ economic position, but also through competing interests. Explicitly stating this latter point was the 1933 monograph Health and Environment ( Sydenstricker 1933 ), prepared for the U.S. President's Research Committee on Social Trends by Edgar Sydenstricker (1881–1936), a leading health researcher and the first statistician to serve in the U.S. Public Health Service ( Krieger 2011 ; Krieger and Fee 1996 ; Wiehl 1974 ). In this landmark text, which explicitly delineated diverse aspects of what he termed the “social environment” alongside the physical environment, Sydenstricker argued (1933, 16, italics in original):

Economic factors in the conservation or waste of health, for example, are not merely the rate of wages; the hours of labor; the hazard of accident, of poisonous substances, or of deleterious dusts; they include also the attitude consciously taken with respect to the question of the relative importance of large capitalistic profits versus maintenance of the workers’ welfare.

In other words, social relations, not just individual traits, shape population distributions of health.

Influenced by and building on both Greenwood's and Sydenstricker's work, in 1957 Jeremy Morris (1910–2009) published his highly influential and pathbreaking book Uses of Epidemiology ( Morris 1957 ), which remains a classic to this day ( Davey Smith and Morris 2004 ; Krieger 2007a ; Smith 2001 ). Going beyond Frost and Greenwood, Morris emphasized that “the unit of study in epidemiology is the population or group , not the individual ” ( Morris 1957 , 3, italics in original) and also went further by newly defining epidemiology in relational terms, as “ the study of health and disease of populations and of groups in relation to their environment and ways of living ” ( Morris 1957 , 16, italics in original). As a step toward defining “population,” Morris noted that “the ‘population’ may be of a whole country or any particular and defined sector of it” ( Morris 1957 , 3), as delimited by people's “environment, their living conditions, and special ways of life” ( Morris 1957 , 61). He also, however, recognized that better theorizing about populations was needed and hence called for a greater “understanding of the properties of individuals which they have in virtue of their group membership” ( Morris 1957 , 120, italics in original). But this appeal went largely unheeded, as it directly contradicted the era's prevailing framework of methodological individualism ( Issac 2007 ; Krieger 2011 ; Ross 2003 ).

Morris's insights notwithstanding, the dominant view has remained what is presented in table 1 . Even the recent influential work of Geoffrey Rose (1926–1993), crucial to reframing individual risk in population terms, theorized populations primarily in relation to their distributional, not substantive, properties ( Rose 1985 , 1992 , 2008 ). Rose's illuminating analyses thus emphasized that (1) within a population, most cases arise from the proportionately greater number of persons at relatively low risk, as opposed to the much smaller number of persons at high risk; (2) determinants of risk within populations may not be the same as determinants of risk between populations; and (3) population norms shape where both the tails and the mean of a distribution occur. Rose thus cogently clarified that to change populations is to change individuals, and vice versa, implying that the two are mutually constitutive, but he left unspecified who and what makes meaningful populations and when they can be meaningfully compared.
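Rose's first point is at bottom an arithmetic one, and a small worked example may help. The group sizes and risks below are hypothetical numbers chosen only to show the calculation; they are not drawn from Rose's analyses.

```python
# Hypothetical numbers chosen only to illustrate the arithmetic.
groups = {
    "lower risk": {"people": 9000, "risk": 0.02},
    "higher risk": {"people": 1000, "risk": 0.10},
}

# Expected cases in each group = number of people x individual risk.
expected_cases = {name: g["people"] * g["risk"] for name, g in groups.items()}
total_cases = sum(expected_cases.values())
share_from_lower_risk = expected_cases["lower risk"] / total_cases

print(expected_cases)
print(round(share_from_lower_risk, 2))
```

Under these assumed figures, the many people at lower risk account for more expected cases than the few at higher risk, even though each higher-risk person is five times as likely to become a case.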

Current Challenges to Conventional Views of “Population.” A new wave of work contesting the still reigning idea of “the average man” can currently be found in recent theoretical and empirical work in the social and biological sciences attempting to analyze population phenomena in relation to dynamic causal processes that encompass multiple levels and scales, from macro to micro ( Biersack and Greenberg 2006 ; Eldredge 1999 ; Eldredge and Grene 1992 ; Gilbert and Epel 2009 ; Grene and Depew 2004 ; Harraway 2008 ; Illari, Russo, and Williamson 2011 ; Krieger 2011 ; Lewontin 2000 ; Turner 2005 ). Also germane is research on system properties in the physical and information sciences ( Kuhlmann 2011 ; Mitchell 2009 ; Strevens 2003 ).

Applicable to the question of who and what makes a population, one major focus of this alternative thinking is on processes that generate, maintain, transform, and lead to the demise of complex entities. This perspective builds on and extends a long history of critiques of reductionism ( Grene and Depew 2004 ; Harré 2001 ; Illari, Russo, and Williamson 2011 ; Lewontin 2000 ; Turner 2005 ; Ziman 2000 ), which together aver that properties of a complex “whole” cannot be reduced to, and explained solely by, the properties of its component “parts.” The basic two-part argument is that (a) new (emergent) properties can arise out of the interaction of the “parts” and (b) properties of the “whole” can transform the properties of their parts. Thus, to use one well-known example, a brain can think in ways that a neuron cannot. Taking this further in regard to the generative causal processes at play, what a brain thinks can affect neuron connections within the brain, and it also is affected by the ecological context and experiences of the organism, of which the brain is a part ( Fox, Levitt, and Nelson 2010 ; Gibson 1986 ; Harré 2001 ; Stanley, Phelps, and Banaji 2008 ). The larger claim is that the causal processes that give rise to complex entities can both structure and transform the characteristics of both the whole and its parts.

What might it look like for public health to bring this alternative perspective to the question of defining, substantively, who and what makes a population? Let me start with a conceptual answer, followed by some concrete public health propositions and examples.

Populations as Relational Beings: An Alternative Causal Conceptualization

In brief, I argue that a working definition of “populations” for public health (or any field concerned with living organisms) would, in line with Sydenstricker (1933) and Morris (1957) and the other contemporary theorists just cited, stipulate that populations are first and foremost relational beings, not “things.” They are active agents, not simply statistical aggregates characterized by distributions.

Specifically, as tables 2 and 3 show, the substantive populations that populate our planet

Conceptual Criteria for Defining Meaningful Populations for Population Sciences, Guided by the Ecosocial Theory of Disease Distribution

Source: Krieger 1994 , 2001 , and 2011 , 214–15.

Defining Features of Populations of Living Beings, Including Humans, Relevant to Public Health and Population Sciences

  • Are animate, self-replicating, and bounded complex entities, generated by systemic causal processes.
  • Arise from and are constituted by relationships of varying strengths, both externally (with and as bounded by other populations) and internally (among their component beings).
  • Are inherently constituted by, and simultaneously influence the characteristics of, the varied individuals who comprise its members and their population-defined and -defining relationships.

It is these relationships and their underlying causal processes (both deterministic and probabilistic), not simply random samples derived from large numbers, that make it possible to make meaningful substantive and statistical inferences about population characteristics, as well as meaningful causal inferences about observed associations.

Accordingly, as summarized by Richard A. Richards, a philosopher of biology (who was writing about species, one type of population), populations have “well-defined beginnings and endings, and cohesion and causal integration” ( Richards 2001 ). They likewise necessarily exhibit historically contingent distributions in time and space, by virtue of the dynamic interactions intrinsically occurring between (and within) their unique individuals and with other equally dynamic codefining populations and also their changing abiotic environs. Underscoring this point, even a population of organisms cloned from a single source organism will exhibit variation and distributions as illustrated by the phenomenon of developmental “noise,” an idea presaged by early twentieth-century observations of chance differences in coat color among litter mates of pure-bred populations raised in identical circumstances ( Davey Smith 2011 ; Lewontin 2000 ; Wright 1920 ).

As for the inherent relationships characterizing populations, both internally and externally, I suggest that four key types stand out, as informed by the ecosocial theory of disease distribution ( Krieger 1994 , 2001 , 2011 ); the collaborative writing of Niles Eldredge, an evolutionary biologist, and Marjorie Grene, a philosopher of biology ( Eldredge and Grene 1992 ); as well as works from political sociology, political ecology, and political geography ( Biersack and Greenberg 2006 ; Harvey 1996 ; Nash and Scott 2001 ). As tables 2 and 3 summarize, these four kinds of relationships are (1) genealogical , that is, relationships by biological descent; (2) internal and economical, in the original sense of the term, referring to relationships essential to the daily activities of whatever is involved in maintaining life (in ancient Greece, oikos , the root of the “eco” in both “ecology” and “economics,” referred to a “household,” conceptualized in relation to the activities and interactions required for its existence [ OED 2010]); (3) external and ecological , referring to relationships between populations and with the environs they coinhabit; and (4) in the case of people (and likely other species as well), teleological , that is, by design, with some conscious purpose in mind (e.g., citizenship criteria). Spanning from mutually beneficial (e.g., symbiotic) to exploitative (benefiting one population at the expense of the other), these relationships together causally shape the characteristics of populations and their members.

What are some concrete examples of animate populations that exemplify these points? Table 3 provides four examples. Two pertain to human populations: the “U.S. population” ( Foner 1997 ; Zinn 2003 ) and “social classes” ( Giddens and Held 1982 ; Wright 2005 ). The third considers microbial populations within humans ( Dominguez-Bello and Blaser 2011 ; Pflughoeft and Versalovic 2012 ; Walter and Ley 2011 ), and the fourth concerns a plant population, a species of tree, the poplar, whose genus name ( Populus ) derives from the same Latin root as “population” ( Braatne, Rood, and Heillman 1996 ; Fergus 2005 ; Frost et al. 2007 ; Jansson and Douglas 2007 ). Together, these examples clarify what binds—as well as distinguishes—each of these dynamic populations and their component individuals. They likewise underscore that contrary to common usage, “population” and “individual” are not antonyms. Instead, they hark back to the original meaning of “individual”—that is, “individuum,” or what is indivisible, referring to the smallest unit that retained the properties of the whole to which it intrinsically belonged ( OED 2010; Williams 1985 ). Thus, although it is analytically possible to distinguish between “populations” and “individuals,” in reality these phenomena occur and are lived simultaneously. A person is not an individual on one day and a member of a population on another. Rather, we are both, simultaneously. This joint fact is fundamental and is essential to keep in mind if analysis of either individual or population phenomena is to be valid.

The importance of considering the intrinsic relationships—both internal and external—that are the integuments of living populations, themselves active agents and composed of active agents, is further illuminated through contrast to the classic case of a hypothetical population: the proverbial jar of variously colored marbles, used in many classes to illustrate the principles of probability and sampling. Apart from having been manufactured to be of a specific size, density, and color, there are no intrinsic relationships between the marbles as such. Spill such a jar, and see what happens.

As this thought experiment makes clear, the marbles will not reconstitute themselves into any meaningful relationships in space or time. They will just roll to wherever they do, and that will be the end of it, unless someone with both energy and a plan scoops them up and puts them back in the jar. Nor will a sealed jar of marbles change its color composition (i.e., the proportion of marbles of a certain color), or an individual marble change its color, unless someone opens the jar and replaces, adds, or removes some marbles or treats them with a color-changing agent. Hence, a purely statistical understanding of “populations,” however necessary for sharpening ideas about causal inference, study design, and empirical estimation, is by itself insufficient for defining and analyzing real-life populations, including “population health.”

That said, marbles do have their uses. In particular, they can help us visualize how causal determinants can structure population distributions of the risks of random individuals via what I term “structured chances.”

Populations and Structured Chances

One long-standing conundrum in population sciences is their ability to identify and use data on population regularities to elucidate causal pathways, even though they cannot predict which individuals in the population will experience the outcome in question ( Daston 1987 ; Desrosières 1998 ; Hacking 1990 ; Illari, Russo, and Williamson 2011 ; Porter 1981 , 2002 , 2003 ; Quetelet 1835 ; Stigler 1986 ; Strevens 2003 ). This incommensurability of population and individual data has been a persistent source of tension between epidemiology and medicine (Frost [1927] [ 1941 ]; Greenwood 1935 ; Morris 1957 ; Rose 1992 , 2008 ). Epidemiologic research, for example, routinely uses aggregated data obtained from individuals to gain insight into both disease etiology and why population rates vary, and does so with the understanding that such research cannot predict which individual will get the disease in question ( Coggon and Martyn 2005 ). By contrast, medical research remains bent on using just these sorts of data to predict an individual's risk, as exemplified in its increasingly molecularized quest for “personalized medicine” ( Davey Smith 2011 ).

Where marbles enter the picture is that they can, through the use of a physical model, demonstrate the importance of how population distributions are simultaneously shaped by both structure (arising from causal processes) and randomness (including truly stochastic events, not just “randomness” as a stand-in for “ignorance” of myriad deterministic events too complex to model). As Stigler has recounted (1997), perhaps the first person to propose using physical models to understand probability was Sir Francis Galton (1822–1911), a highly influential British scientist and eugenicist ( figure 2 ), who himself coined the term “eugenics” and who held that heredity fundamentally trumped “environment” for traits influencing the capacity to thrive, whether physical, like health status, or mental, like “intelligence” ( Carlson 2001 ; Cowan 2004 ; Galton 1889 , 1904 ; Keller 2010 ; Kevels 1985 ; Stigler 1997 ). In his 1889 opus Natural Inheritance ( Galton 1889 ), Galton sketched ( figure 3 ) “an apparatus … that mimics in a very pretty way the conditions on which Deviation depends” ( Galton 1889 , 63), whereby gun shots (i.e., marble equivalents) would be poured through a funnel down a board whose surface was studded with carefully placed pins, off which each pellet would ricochet, to be collected in evenly spaced bins at the bottom.

Producing population distributions: structured chances as represented by physical models.

Sources: Galton's Quincunx, Galton 1889 , 63; physical models, Limpert, Stahel, and Abbt 2001 (reproduced with permission).

Galton termed his apparatus, which he apparently never built ( Stigler 1997 ), the “Quincunx” because the pattern of the pins used to deflect the shot was like a tree-planting arrangement of that name, which at the time was popular among the English aristocracy ( Stigler 1997 ). The essential point was that although each presumably identical ball had the same starting point, depending on the chance interplay of which pins it hit during its descent at which angle, it would end up in one or another bin. The accumulation of balls in any bin in turn would reflect the number of possible pathways (i.e., likelihood) leading to its ending up in that bin. Galton designed the pin pattern to yield a normal distribution. He concluded that his device revealed ( Galton 1889 , 66)

a wonderful form of cosmic order expressed by the “Law of Frequency of Error.” The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy the more perfect is its sway … each element, as it is sorted into place, finds, as it were, a pre-ordained niche, accurately adapted to fit it.

In other words, in accord with Quetelet's view of “l’homme moyen,” Galton saw the order produced as the property of each “element,” in this case, the gun shot.

However, a little more than a century later, some physicists not only built Galton's “Quincunx,” as others have done (Stigler 1997), but went one step further (Limpert, Stahel, and Abbt 2001): they built two, one designed to generate the normal distribution and the other to generate the log normal distribution (a type of distribution skewed on the normal scale, but for which the natural logarithm of the values displays a normal distribution) (figure 3). As their devices clearly show, what structures the distribution is not the innate qualities of the “elements” themselves but the features of both the funnel and the pins—both their shape and placement. Together, these structural features determine which pellets can (or cannot) pass through the pins and, for those that do, their possible pathways.

The lesson is clear: altering the structure can change outcome probabilities, even for identical objects, thereby creating different population distributions. For the population sciences, this insight permits understanding how there can simultaneously be both chance variation within populations (individual risk) and patterned differences between population distributions (rates). Such an understanding of “structured chances” rejects explanations of population difference premised solely on determinism or chance and also brings Quetelet's astronomical “l’homme moyen” and its celestial certainties of fixed stars back down to earth, grounding the study of populations instead in real-life, historically contingent causal processes, including those structured by human agency.
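
The contrast between the two boards can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of the devices built by Limpert, Stahel, and Abbt: it assumes that each pin deflects a ball left or right with equal probability, that the “normal” board shifts the ball by a fixed amount at each pin, and that the “log normal” board instead multiplies or divides the ball's position by a fixed factor. The function names, the number of rows, the step sizes, and the seed are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def drop_ball(rows, step, rng, multiplicative=False):
    """Return the final horizontal position of one ball.

    Additive board: each pin shifts the ball left or right by `step`.
    Multiplicative board: each pin multiplies or divides the position by `step`.
    (Both rules are simplifying assumptions, not measured device behavior.)
    """
    position = 1.0 if multiplicative else 0.0
    for _ in range(rows):
        go_right = rng.random() < 0.5  # the chance event at each pin
        if multiplicative:
            position = position * step if go_right else position / step
        else:
            position = position + step if go_right else position - step
    return position


def run_board(n_balls, rows, step, seed, multiplicative=False):
    """Drop `n_balls` identical balls through one board and collect positions."""
    rng = random.Random(seed)
    return [drop_ball(rows, step, rng, multiplicative) for _ in range(n_balls)]


def skewness(values):
    """Population skewness: the mean cubed standardized deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical settings: 20,000 balls, 12 rows of pins, same seed for both boards.
additive = run_board(n_balls=20000, rows=12, step=1.0, seed=1)
multiplicative = run_board(n_balls=20000, rows=12, step=1.2, seed=1,
                           multiplicative=True)
log_multiplicative = [math.log(v) for v in multiplicative]

print("skewness, additive board:", round(skewness(additive), 2))
print("skewness, multiplicative board:", round(skewness(multiplicative), 2))
print("skewness, log of multiplicative board:", round(skewness(log_multiplicative), 2))
```

Because both runs use the same seed, every ball meets the identical sequence of chance deflections on each board; only the rule applied at each pin differs. Under these assumptions the additive board should yield a roughly symmetric distribution and the multiplicative board a right-skewed one whose logarithm is again symmetric, which is the sense in which the structure of the board, not any property of the balls, shapes the distribution.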

Rethinking the Meaning and Making of Means: The Utility of Critical Population-Informed Thinking

How might a more critical understanding of the substantive nature of real-life populations benefit research on, knowledge about, and policies regarding population health and health inequities? Drawing on table 2 's conceptual criteria for defining who and what makes populations, table 4 offers four sets of critical public health propositions about “populations” and “study populations,” whose salience I assess using examples of breast cancer, a disease increasingly recognized as a major cause of morbidity and mortality in both the global South and the global North ( Althuis et al. 2005 ; Bray, McCarron, and Parkin 2004 ; Parkin and Fernández 2006 ) and one readily revealing that the problem of meaningful means is as vexing for “the average woman” as for “the average man.”

Four Propositions to Improve Population Health Research, Premised on Critical Population-Informed Thinking

Propositions 1 and 2: Critically Parsing Population Rates and Their Comparisons

Consider, first, three illustrative cases pertaining to analyses of population rates of breast cancer:

  • A recent high-profile analysis of the global burden of breast cancer ( Briggs 2011 ; Forouzanafar et al. 2011 ; IHME 2011; Jaslow 2011 ), which estimated and compared rates across countries, accompanied by interpretative text, with the article stating, for example, that Colombia and Venezuela “… have very different trends, despite sharing many of the same lifestyle and demographic factors,” followed by the inference that the “explanation of these divergent trends may lie in the interaction between genes and individual risk factors.” (IHME 2011, 24)
  • Typical reviews of the global epidemiology of breast cancer, which contain such statements as “Population-based statistics show that globally, when compared to whites, women of African ancestry (AA) tend to have more aggressive breast cancers that present more frequently as estrogen receptor negative (ERneg) tumors” ( Dunn et al. 2010 , 281); and “early onset ER negative tumors also develop more frequently in Asian Indian and Pakistani women and in women from other parts of Asia, although not as prevalent as it is in West Africa.” ( Wallace, Martin, and Ambs 2011 , 1113)
  • The headline-making news that the U.S. breast cancer incidence rate in 2003 unexpectedly dropped by 10 percent, a huge decrease ( Kolata 2006 , 2007 ; Ravdin et al. 2006 , 2007 ).

What these three commonplace examples have in common is an uncritical approach to presenting and interpreting population data, premised on the dominant assumption that population rates are statistical phenomena driven by innate individual characteristics. Cautioning against accepting these claims at face value are propositions 1 and 2, with their emphases, respectively, on (1) critically appraising who constitutes the populations whose means are at issue and (2) critically considering the dynamic relationships that give rise to population patterns of health, including health inequities.

From the standpoint of proposition 1, the first relevant fact is that as a consequence of global disparities in resources (Klassen and Smith 2011) arising from complex histories of colonialism and underdevelopment (Birn, Pillay, and Holtz 2009), only 16 percent of the world's population is covered by cancer registries, with coverage of less than 10 percent within the world's most populous regions (Africa, Asia [other than Japan], Latin America, and the Caribbean), versus 99 percent in North America (Parkin and Fernández 2006). Put in national terms, among the 184 countries for which the International Agency for Research on Cancer (IARC) reports estimated rates, only 33 percent—almost all located in the global North—have reliable national incidence data (GLOBOCAN 2012). These data limitations are candidly acknowledged both by IARC (GLOBOCAN 2012) and in the scientific literature, including that on breast cancer (Althuis et al. 2005; Bray, McCarron, and Parkin 2004; Ferlay et al. 2012; Krieger, Bassett, and Gomez 2012; Parkin and Fernández 2006). To generate estimates of incidence in countries lacking national cancer registry data, the IARC transparently employs several modeling approaches, based on, for example, a country's national mortality data combined with city-specific or regional cancer registry data (if they do exist, albeit typically not including the rural poor) or, when no credible national data are available, estimating rates based on data from neighboring countries (GLOBOCAN 2012).

A critical analysis of the population claims asserted in examples 1 and 2 starts by questioning whether the means at issue can bear the weight of meaningful comparisons and inference. Thus, relevant to example 1, Colombia has only one city-based cancer registry (in Cali), and Venezuela has no cancer registries at all ( GLOBOCAN 2012 ). Moreover, the rates compared ( Forouzanafar et al. 2011 ; IHME 2011 ) were generated by nontransparent modeling methods ( Krieger, Bassett, and Gomez 2012 ) that have empirically been shown not to estimate accurately the actually observed rates in the “gold-standard” Nordic countries, known for their excellent cancer registration data ( Ferlay et al. 2012 ). Second, relevant to the countries and geographic regions listed in example 2, the cancer incidence rates estimated by IARC are based (a) for Pakistan, solely on the weighted average for observed rates in south Karachi, (b) for India, on a complex estimation scheme for urban and rural rates in different Indian states and data from cancer registries in several cities, and (c) for western Africa, on the weighted average of data for sixteen countries, of which ten have incidence rates estimated based on those of neighboring countries, another five rely on data extrapolated from cancer registry data from one city (or else city-based cancer registries in neighboring countries), and only one of which has a national cancer registry ( GLOBOCAN 2012 ). Critical thinking about who and what makes a population thus prompts questions about whether the data presented in examples 1 and 2 can provide insight into either alleged individual innate characteristics or into what the true on-average rate would be if everyone were counted (let alone what the variability in rates might be across social groups and regions). There is nothing mundane about a mean.

Proposition 2 in turn calls attention to structured chance in relation to the dynamic intrinsic and extrinsic relationships constituting national populations, with table 2 illustrating what types of relationships are at play using the example of the United States. It thus spurs critical queries as to whether observed national and racial/ethnic differences (if real, and not an artifact of inaccurate data) arise from innate (i.e., genetic) differences between “populations,” as posed by examples 1 and 2. Two lines of evidence alternatively suggest these population differences could instead be embodied inequalities ( Krieger 1994 , 2000 , 2005 , 2011 ; Krieger and Davey Smith 2004 ) that arise from structured chances. The first line pertains to well-documented links among national, racial/ethnic, and socioeconomic inequalities in breast cancer incidence, survival, and mortality ( Klassen and Smith 2011 ; Krieger 2002 ; Vona-Davis and Rose 2009 ). The second line stems from research that evaluates claims of intrinsic biological difference by examining their dynamics, as illustrated by the first investigation to test statistically for temporal trends in the white/black odds ratio for ER positive breast cancer between 1992 and 2005, which revealed that in the United States, the age-adjusted odds ratio rose between 1992 and 2002 and then leveled off (and actually fell among women aged fifty to sixty-nine) ( Krieger, Chen, and Waterman 2011 ).

Relevant to example 3, these findings of dynamic, not fixed, black/white risk differences for breast cancer ER status likely reflect the socially patterned abrupt decline in hormone therapy use following the July 2002 release of results from the U.S. Women's Health Initiative (WHI) (Rossouw et al. 2002). This was the first large randomized clinical trial of hormone therapy, despite its having been widely prescribed since the mid-1960s (Krieger 2008). The WHI found that contrary to what was expected, hormone therapy did not decrease (and may have raised) the risk of cardiovascular disease, and at the same time, the WHI confirmed prior evidence that long-term use of hormone therapy increased the risk of breast cancer (especially ER+). Before the WHI results were released, hormone therapy use in the United States was highest among white women with health insurance who could afford, and were healthy enough, to take the medication without any contraindications (Brett and Madans 1997; Friedman-Koss et al. 2002). Population-informed thinking would thus predict that any drops in breast cancer incidence would occur chiefly among those sectors of women most likely to have used hormone therapy. Subsequent global research has borne out these predictions (Zbuk and Anand 2012), including the sole U.S. study that systematically explored socioeconomic differentials both within and across racial/ethnic groups, which found that the observed breast cancer decline was restricted to white non-Hispanic women with ER+ tumors residing in more affluent counties (Krieger, Chen, and Waterman 2010). These results counter the widely disseminated and falsely reassuring impression that breast cancer risk was declining for everyone (Kolata 2006, 2007). They accordingly provide better guidance to public health agencies, clinical providers, and breast cancer advocacy groups regarding trends in breast cancer occurrence among the real-life populations they serve.

Together, these examples illuminate why proposition 2's corollary 2.2 proposes conceptualizing the jointly lived experience of population rates and individual manifestations of health, disease, and well-being as what I would term “embodied phenotype.” Inherently dynamic and relational, this proposed construct meaningfully links the macro and micro, and populations and individuals, through the play of structured chance. It also is consonant with new insights emerging from the fast-growing field of ecological evolutionary developmental biology (“eco-evo-devo”) into the profound and dynamic links among environmental exposures, gene expression, development, speciation, and the flexibility of organisms’ phenotypes across the life span ( Gilbert and Epel 2009 ; Piermsa and van Gils 2011 ; West-Eberhard 2003 ). Only just beginning to be integrated into epidemiologic theorizing and research ( Bateson and Gluckman 2012 ; Davey Smith 2011 , 2012 ; Gilbert and Epel 2009 ; Kuzawa 2012 ; Relton and Davey Smith 2012 ), eco-evo-devo's historical and relational approach to biological expression affirms the need for critical population-informed thinking.

Propositions 3 and 4: Study Participants, Study Populations, and Causal Inference

Finally, a population-informed approach helps clarify, in accordance with propositions 3 and 4, why improving our understanding of “study populations,” and thus study participants, matters for causal inference. Consider, for example, the 1926 pathbreaking epidemiologic study of breast cancer conducted by the British physician and epidemiologist Janet Elizabeth Lane-Claypon (1877–1967) ( Lane-Claypon 1926 ), the first study to identify systematically what were then called “antecedents” of breast cancer (today termed “risk factors”) and now also widely acknowledged to be the first epidemiologic case-control study, as well as the first epidemiologic study to publish its questionnaire ( Press and Pharoah 2010 ; Winkelstein 2004 ). Quickly replicated in the United States in 1931 by Wainwright ( Wainwright 1931 ), these two studies have recently been reanalyzed, using current statistical methods. The results show that their estimates of risk associated with major reproductive risk factors (e.g., early age at first birth, parity, lactation, and early age at menopause) are consistent with the current evidence ( Press and Pharoah 2010 ).

Not addressed in the reanalysis, however, are the two studies’ different results for occupational class, defined in relation to the women's employment before marriage. When these occupational data are recoded into the meaningful categories of professional, working-class nonmanual, and working-class manual (Krieger, Williams, and Moss 1997; Rose and Pevalin 2003), the data quickly reveal why the studies had discrepant results. Thus, Lane-Claypon concluded there was no “appreciable difference” in breast cancer risk by social class (Lane-Claypon 1926, 12) (χ² = 1.833; p = 0.4), whereas in the U.S. study risk was lower among the working-class manual women (χ² = 9.305; p = 0.01). Why? In brief, a far higher proportion of the British women were working-class manual (78.7% cases, 84.2% controls vs. the U.S. women: 48.8% cases, 62.5% controls), and a far lower proportion were professionals (6.5% cases, 4.2% controls, vs. the U.S. women: 23.8% cases, 20.7% controls). Just as Rose famously observed that if everyone smoked, smoking would not be identified as a cause of lung cancer (Rose 1985, 1992), when most study participants are from only one social class, socioeconomic inequalities in health cannot and will not be detected (Krieger 2007b). The net result is erroneous causal inferences about the relevance of social class to structuring the risk of disease, thereby distorting the evidence base informing efforts to address health inequities.
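
The arithmetic behind this point can be sketched as follows. The snippet is an illustration under stated assumptions, not the method used in either study or in the reanalysis. It first computes crude odds ratios from the percentages quoted above, collapsing the three occupational categories into working-class manual versus all others (a simplification introduced here). It then takes a purely hypothetical scenario, with an assumed odds ratio of 0.6 and an assumed 200 cases and 200 controls, and compares the Pearson chi-square statistic expected when half of the controls are exposed with the statistic expected when 95 percent are. All function names and scenario values are hypothetical.

```python
def odds_ratio(p_cases, p_controls):
    """Crude odds ratio from the fraction exposed among cases and controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


def pearson_chi2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def expected_chi2(p_controls, true_or, n_per_group):
    """Chi-square for the expected 2x2 table implied by a fixed odds ratio."""
    odds_cases = true_or * p_controls / (1 - p_controls)
    p_cases = odds_cases / (1 + odds_cases)
    a, b = n_per_group * p_cases, n_per_group * (1 - p_cases)        # cases
    c, d = n_per_group * p_controls, n_per_group * (1 - p_controls)  # controls
    return pearson_chi2(a, b, c, d)


# Percentages quoted in the text, collapsed to manual vs. all other (an assumption).
british_or = odds_ratio(0.787, 0.842)
us_or = odds_ratio(0.488, 0.625)

# Hypothetical scenario: same odds ratio and sample size, different exposure mix.
HYPOTHETICAL_OR = 0.6
HYPOTHETICAL_N = 200
mixed = expected_chi2(0.50, HYPOTHETICAL_OR, HYPOTHETICAL_N)
homogeneous = expected_chi2(0.95, HYPOTHETICAL_OR, HYPOTHETICAL_N)

print(f"crude OR, British study: {british_or:.2f}; U.S. study: {us_or:.2f}")
print(f"expected chi-square, 50% exposed: {mixed:.2f}; 95% exposed: {homogeneous:.2f}")
```

With these assumed values, the expected statistic in the mixed scenario exceeds the conventional 3.84 threshold for one degree of freedom, whereas the statistic in the nearly homogeneous scenario does not, even though the underlying odds ratio is identical. The sketch is meant only to show how a narrow range of exposure can hide an association; it does not reproduce the published test statistics, which depend on sample sizes and categories not given here.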

Critical population-informed thinking therefore would question the dominant conventional cleavage, in both the population health and the social sciences, between “internal validity” and “generalizability” (or “external validity”) and the related endemic language of “study population”—routinely casually equated with study participants—and “general population” (Broadbent 2011; Cartwright 2011; Cook 2001; Kincaid 2011; Kukuall and Ganguli 2012; Porta 2008; Rothman, Greenland, and Lash 2008). One critical determinant of a study's ability to provide valid tests of exposure-outcome hypotheses is the range of exposure encompassed (Chen and Rossi 1987; Schlesselman and Stadel 1987); another is the extent to which participants’ selection into a study is associated with important unmeasured determinants of the outcome (Pizzi et al. 2011). Given the social structuring of the vast majority of exposures, as evidenced by the virtually ubiquitous and dynamic societal patternings of disease (Birn, Pillay, and Holtz 2009; Davey Smith 2003; Krieger 1994, 2011; WHO 2008), meaningful research requires that the range of exposures experienced (or not) by study participants capture the etiologically relevant range experienced in the real-world societies, that is, meaningful populations, of which they are a part. The point is not that ideal study participants should be a random sample of some “general population”; instead, it is that their location in the intrinsic and extrinsic relationships creating their population membership cannot be ignored.

Highlighting the need for critical population-informed thinking is advice provided in the widely used and highly influential textbook Modern Epidemiology ( Rothman, Greenland, and Lash 2008 ). Although the text correctly states that “the pursuit of representativeness can defeat the goal of validly identifying causal relations,” it further asserts that “one would want to select study groups for homogeneity with respect to important confounders, for highly cooperative behavior, and for availability of accurate information, rather than attempt to be representative of a natural population” (p. 146). “Classic examples” of the populations fulfilling these criteria are stated to be “the British Physicians’ Study of smoking and health and the Nurses’ Health Study, neither of which were remotely representative of the general population with respect to sociodemographic factors” ( Rothman, Greenland, and Lash 2008 , 146–47).

Of course, studies need accurate data, but the advice here raises more questions than it answers. First, just who and what is a “natural population”?—and, related, who is that “general population”? Second, might there be drawbacks to, not just benefits from, preferentially studying predominantly white health professionals and others with the resources to be “highly cooperative” and possess “accurate information”? Stated another way, what might be the adverse consequences on scientific knowledge and policymaking of discounting people that mainstream research already routinely and problematically calls “hard-to-reach” populations (Crosby et al. 2010; Shaghaghi, Bhopal, and Sheik 2011)? These populations include the disempowered and dispossessed, whose adverse social and physical circumstances mean that their range of exposures almost invariably differs, in both level and type, from that encountered by the effectively “easy-to-reach.” Might it not also be critical for researchers to develop more inclusive approaches that could yield accurate etiologic and policy-relevant data on the distributions and determinants of disease among those who bear the brunt of health inequities (Smylie et al. 2012)?—a scientific task that necessarily requires contrasts in both exposures and outcomes between the social groups defined by the inequitable societal relationships at issue, whether involving social class, racism, gender, or other forms of social inequality (Krieger 2007b).

Reflecting on how who is studied determines what can be learned, the eminent British biologist Lancelot Hogben (1895–1975) ( figure 2 ; Bud 2004 ; Werskey 1988 ), in his lucid and prescient 1933 book titled Nature and Nurture ( Hogben 1933 , 106), cogently observed:

Differences to which members of the same family or different families living at one and the same social level are exposed may be very much less than differences to which individuals belonging to families taken from different social levels are exposed. Experiment shows that ultra-violet light has a considerable influence on growth in mammals. In Great Britain, some families live continuously in the sooty atmosphere of an industrial area. Others spend their winters on the Riviera.

In other words, critical population-informed thinking is vital to good science.

Conclusion: Meaningful Means, Embodied Phenotypes, and the Structural Determinants of Populations and the People's Health

In conclusion, to improve causal inference and policies and action based on this knowledge, the population sciences need to expand and deepen theorizing about who and what makes populations and their means. At a time when the topic of causality in the sciences remains hotly debated by philosophers and researchers alike, all parties nevertheless agree that “the question of how probabilistic accounts of causality can mesh with mechanistic accounts of causality desperately needs answering” ( Illari, Russo, and Williamson 2011 , 20). As my article makes clear, the idea and reality of “population” reside at the nexus of this question. Clarifying the substantive defining features of populations, including who and what structures the dynamic and emergent distributions of their characteristics and components, is thus crucial to both analyzing and altering causal processes. For public health, this means sharpening our thinking about how structured chances, structured by the political and economic relationships constituting the societal determinants of health ( Birn, Pillay, and Holtz 2009 ; Irwin et al. 2006 ; Krieger 1994 , 2011 ), generate the embodied phenotypes that are the people's health.

As should be evident, the challenges to developing critical population-informed thinking are not purely conceptual; they are also political, because these ideas necessarily engage with issues involving not only the distribution of people but also the distribution of power and property and the societal relationships that bind individuals and populations, for good and for bad ( Krieger 2011 ). Nearly two hundred years after Quetelet introduced his “l’homme moyen,” the countervailing call for routinely measuring and tracking population health inequities, and not just on-average population rates of health, is only now gaining traction globally (WHO 2008, 2011). This is coincident with the ever-accelerating aforementioned genomic quest for “personalized medicine” ( Davey Smith 2011 ), as well as the continued economic, social, political, and public health reverberations of the 2008 global economic crash ( Benatar, Gill, and Bakker 2011 ; Stiglitz 2010 ). In such a context, clarity regarding who and what populations are, and the making and meaning of their means, is vital to population sciences, population health, and the promotion of health equity.

Acknowledgments

No funding supported this work.

References

  • Althuis MD, Dozier JM, Anderson WF, Devesa SS, Brinton LA. Global Trends in Breast Cancer Incidence and Mortality 1973–1999. International Journal of Epidemiology. 2005;34:405–12.
  • Bateson P, Gluckman P. Plasticity and Robustness in Development and Evolution. International Journal of Epidemiology. 2012;41:219–23.
  • Benatar SR, Gill S, Bakker I. Global Health and the Global Economic Crisis. American Journal of Public Health. 2011;101:646–53.
  • Biersack A, Greenberg JB. Reimagining Political Ecology. Durham, NC: Duke University Press; 2006.
  • Birn AE, Pillay Y, Holtz TM. Textbook of International Health: Global Health in a Dynamic World. 3rd ed. New York: Oxford University Press; 2009.
  • Braatne JH, Rood SB, Heilman PE. Life History, Ecology, and Conservation of Riparian Cottonwoods in North America. In: Stettler RF, Bradshaw HD Jr, Heilman PE, Hinckley TM, editors. Biology of Populus and Its Implications for Management and Conservation. Ottawa: National Research Council of Canada, NRC Research Press; 1996. pp. 57–85.
  • Bray F, McCarron P, Parkin DM. The Changing Global Patterns of Female Breast Cancer Incidence and Mortality. Breast Cancer Research. 2004;6:229–39.
  • Brett KM, Madans JH. Difference in Use of Postmenopausal Hormone Replacement Therapy by Black and White Women. Menopause. 1997;4:66–70.
  • Briggs H. Women's Cancers Reach Two Million. 2011. BBC News Health, September 14. Available at http://www.bbc.co.uk/news/health-14917284 (accessed June 17, 2012).
  • Broadbent A. Inferring Causation in Epidemiology: Mechanisms, Black Boxes, and Contrasts. In: Illari PM, Russo F, Williamson J, editors. Causality in the Sciences. Oxford: Oxford University Press; 2011. pp. 45–69.
  • Bud R. Oxford Dictionary of National Biography. Oxford: Oxford University Press; 2004. Hogben, Lancelot Thomas (1895–1975). Available at http://www.oxforddnb.com.ezp-prod1.hul.harvard.edu/view/article/31244?docPos=1 (accessed June 17, 2012).
  • Burian RM, Zallen DT. Genes. In: Bowler PJ, Pickstone JV, editors. The Modern Biological and Earth Sciences. Cambridge: Cambridge University Press; 2009. Cambridge Histories Online. DOI: 10.1017/CHOL9780521572019.024.
  • Butler AHB. Obituary: Major Greenwood. Journal of the Royal Statistical Society: Series A (General). 1949;112:487–89.
  • Carlson EA. The Unfit: A History of a Bad Idea. Cold Spring Harbor, NY: Cold Spring Harbor Press; 2001.
  • Cartwright N. Predicting “It Will Work for Us”: (Way) beyond Statistics. In: Illari PM, Russo F, Williamson J, editors. Causality in the Sciences. Oxford: Oxford University Press; 2011. pp. 750–68.
  • Carver T. Marx and Marxism. In: Porter TM, Ross D, editors. The Modern Social Sciences. Cambridge: Cambridge University Press; 2003. Cambridge Histories Online. DOI: 10.1017/CHOL9780521594424.013.
  • Chen H-T, Rossi PH. The Theory-Driven Approach to Validity. Evaluation and Program Planning. 1987;10:95–103.
  • Clarke A, Agrò AF, Zheng Y, Tickle C, Jansson R, Kehrer-Sawatzki H, Cooper DN, Delves P, Battista J, Melino G, Perkel DJ, Hetherington AM, Bynum WF, Valpuesta JM, Harper D. Encyclopedia of Life Sciences. Chichester: Wiley; 2000–2011. Available at http://www.els.net/WileyCDA/ (accessed September 6, 2011).
  • Coggon DIW, Martyn CN. Time and Chance: The Stochastic Nature of Disease Causation. The Lancet. 2005;365:1434–37.
  • Cole J. The Power of Large Numbers: Populations, Politics, and Gender in Nineteenth-Century France. Ithaca, NY: Cornell University Press; 2000.
  • Cook TD. Generalization: Conceptions in the Social Sciences. In: Smelser NJ, Baltes PB, editors. International Encyclopedia of the Social & Behavioral Sciences. Oxford: Pergamon; 2001. pp. 6037–43. DOI: 10.1016/B0-08-043076-7/00698-7.
  • Cowan RS. Oxford Dictionary of National Biography. Oxford: Oxford University Press; 2004. Galton, Sir Francis (1822–1911). Available at http://www.oxforddnb.com.ezp-prod1.hul.harvard.edu/view/article/33315 (accessed June 17, 2012).
  • Crosby RA, Salazar LF, DiClemente RJ, Lang DL. Balancing Rigor against the Inherent Limitations of Investigating Hard-to-Reach Populations. Health Education Research. 2010;25:1–5.
  • Crow JF. R.A. Fisher: A Centennial View. Genetics. 1990;124:204–11.
  • Crow JF. Sewall Wright (1889–1988): A Biographical Memoir. Washington, DC: National Academy of Science; 1994.
  • Daintith J, Martin E, editors. A Dictionary of Science. 5th ed. Oxford: Oxford University Press; 2005.
  • Dale AI, Katz S. Arthur L. Bowley: A Pioneer in Modern Statistics and Economics. London: World Scientific Publishing; 2011.
  • Daniel TM. Wade Hampton Frost: Pioneer Epidemiologist 1880–1938. Rochester, NY: University of Rochester Press; 2004.
  • Darwin C. Origin of Species. Edison, NJ: Castle Books; 2004. (1859)
  • Daston LJ. Rational Individuals versus Laws of Society: From Probability to Statistics. In: Krüger L, Daston LJ, Heidelberger M, editors. The Probabilistic Revolution. Vol. 1, Ideas in History. Cambridge, MA: MIT Press; 1987. pp. 295–304.
  • Davenport CB. Heredity in Relation to Eugenics. New York: Henry Holt; 1911.
  • Davey Smith G. Health Inequalities: Lifecourse Approaches. Bristol: Policy Press; 2003.
  • Davey Smith G. Epidemiology, Epigenetics and the “Gloomy Prospect”: Embracing Randomness in Population Health Research and Practice. International Journal of Epidemiology. 2011;40:537–62.
  • Davey Smith G. Epigenesis for Epidemiologists: Does Evo-Devo Have Implications for Population Health Research and Practice? International Journal of Epidemiology. 2012;41:236–47.
  • Davey Smith G, Morris J. A Conversation with Jerry Morris. Epidemiology. 2004;15:770–73.
  • Davis K, Rowland D. Uninsured and Underserved: Inequities in Health Care in the United States. The Milbank Quarterly. 1983;61:149–76.
  • Desrosières A. The Politics of Large Numbers: A History of Statistical Reasoning. Cambridge, MA: Harvard University Press; 1998. Trans. Camille Naish.
  • Dominguez-Bello MG, Blaser MJ. The Human Microbiota as a Marker for Migrations of Individuals and Populations. Annual Review of Anthropology. 2011;40:451–74.
  • Dunn BK, Agurs-Collins T, Browne D, Lubet R, Johnson KA. Health Disparities in Breast Cancer: Biology Meets Socioeconomic Status. Breast Cancer Research and Treatment. 2010;121:281–92.
  • Eldredge N. The Pattern of Evolution. New York: Freeman; 1999.
  • Eldredge N. Darwin: Discovering the Tree of Life. New York: Norton; 2005.
  • Eldredge N, Grene M. Interactions: The Biological Context of Social Systems. New York: Columbia University Press; 1992.
  • Evans RG, Barer ML, Marmor TR. Why Are Some People Healthy and Others Not? The Determinants of Health of Populations. New York: De Gruyter; 1994.
  • Falk R. The Gene—A Concept in Tension: A Critical Overview. In: Beurton PJ, Falk R, Rheinberger H-J, editors. The Concept of the Gene in Development and Evolution: Historical and Epistemological Perspectives. Cambridge: Cambridge University Press; 2000. pp. 317–49.
  • Fee E. Disease and Discovery: A History of the Johns Hopkins School of Hygiene and Public Health, 1916–1939. Baltimore: Johns Hopkins University Press; 1987.
  • Fergus C. Trees of New England: A Natural History. Guilford, CT: FalconGuide; 2005.
  • Ferlay J, Forman D, Mathers CD, Bray F. Re: “Breast and Cervical Cancer in 187 Countries between 1980 and 2010.” The Lancet. 2012;379:1390–91.
  • Foner E, editor. The New American History. Rev. and expanded ed. Philadelphia: Temple University Press; 1997.
  • Forouzanafar MH, Foreman KJ, Delossantos AM, Lozano R, Lopez AD, Murray CJ, Naghanvi M. Breast and Cervical Cancer in 187 Countries between 1980 and 2010: A Systematic Analysis. The Lancet. 2011;378:1461–84.
  • Fox SE, Levitt P, Nelson CA III. How the Timing and Quality of Early Experiences Influence the Development of Brain Architecture. Child Development. 2010;81:28–40.
  • Friedman-Koss D, Crespo CJ, Bellantoni MF, Andersen RE. The Relationship of Race/Ethnicity and Social Class to Hormone Replacement Therapy: Results from the Third National Health and Nutrition Examination Survey 1988–1994. Menopause. 2002;9:264–72.
  • Frost C, Appel H, Carlson J, De Moraes CM, Mescher M, Schultz JC. Within-Plant Signaling by Volatiles Overcomes Vascular Constraints on Systemic Signaling and Primes Responses against Herbivores. Ecology Letters. 2007;10:490–98.
  • Frost WH. In: Epidemiology. Maxcy KF, editor. New York: Commonwealth Fund; 1941. pp. 439–52. (1927) In Papers of Wade Hampton Frost, M.D . [ Google Scholar ]
  • Frost WH. 1976. Some Conceptions of Epidemics in General. American Journal of Epidemiology. 1928; 103 :141–51. [ PubMed ] [ Google Scholar ]
  • Galton F. Natural Inheritance. London: Macmillan; 1889. [ Google Scholar ]
  • Galton F. Eugenics: Its Definition, Scope, and Aims. Nature. 1904; 70 :82. [ Google Scholar ]
  • Gaziano JM. The Evolution of Population Science: Advent of the Mega Cohort. JAMA. 2010; 304 :2288–89. [ PubMed ] [ Google Scholar ]
  • Gibson JJ. The Ecological Approach to Visual Perception. Hillsdale, NJ: Erlbaum; 1986. [ Google Scholar ]
  • Giddens A, Held D, editors. Classes, Power, and Conflict: Classical and Contemporary Debates. Berkeley: University of California Press; 1982. [ Google Scholar ]
  • Gilbert SF, Epel D. Ecological Developmental Biology: Integrating Epigenetics, Medicine, and Evolution. Sunderland, MA: Sinaeur Associates; 2009. [ Google Scholar ]
  • GLOBOCAN. 2012. Data Sources and Methods. International Agency for Research on Cancer, World Health Organization. Available at http://globocan.iarc.fr/ (accessed June 17, 2012)
  • Greenhalgh S. The Social Construction of Population Science: An Intellectual, Institutional, and Political History of Twentieth-Century Demography. Comparative Studies Society History. 1996; 38 :26–66. [ Google Scholar ]
  • Greenwood M. Epidemics and Crowd Diseases: An Introduction to the Study of Epidemiology. London: Williams & Norgate; 1935. [ Google Scholar ]
  • Grene M, Depew D. The Philosophy of Biology. Cambridge: Cambridge University Press; 2004. [ Google Scholar ]
  • Hacking I. The Emergence of Probability. Cambridge: Cambridge University Press; 1975. [ Google Scholar ]
  • Hacking I. The Taming of Chance. Cambridge: Cambridge University Press; 1990. [ Google Scholar ]
  • Hankins FH. Adolphe Quetelet as Statistician. New York: Arno Press; 1968. [ Google Scholar ]
  • Harraway DJ. When Species Meet. Minneapolis: University of Minnesota Press; 2008. [ Google Scholar ]
  • Harré R. Individual/Society: History of the Concept. In: Smelser NJ, Baltes PB, editors. International Encyclopedia of the Social & Behavioral Sciences. Oxford: Pergamon; 2001. pp. 7306–10. DOI: 10.1016/B0-08-043076-7/00125-X . [ Google Scholar ]
  • Harvey D. Justice, Nature, and the Geography of Difference. Cambridge, MA: Blackwell; 1996. [ Google Scholar ]
  • Heesterbeek H. The Law of Mass-Action in Epidemiology: A Historical Perspective. In: Cuddington K, Beisner BE, editors. Ecological Paradigms Lost: Routes of Theory Change. Burlington, MA: Elsevier Academic Press; 2005. pp. 81–106. [ Google Scholar ]
  • Heilbron J, Magnusson L, Wittrock B, editors. The Rise of the Social Sciences and the Formation of Modernity: Conceptual Change in Context, 1750–1850. Dordrecht: Kluwer Academic Publishers; 1998. [ Google Scholar ]
  • Hey J. Regarding the Confusion between the Population Concept and Mayr's “Population Thinking.” Quarterly Review of Biology. 2011; 86 :253–64. [ PubMed ] [ Google Scholar ]
  • Hodge J. Evolution. In: Bowler PJ, Pickstone JV, editors. The Modern Biological and Earth Sciences. Cambridge: Cambridge University Press; 2009. Cambridge Histories Online. DOI: 10.1017/CHOL9780521572019.015 . [ Google Scholar ]
  • Hogben L. Nature and Nurture. London: Williams & Norgate; 1933. [ Google Scholar ]
  • Hogben L. Major Greenwood: 1880–1949. Obituary Notices of Fellows of the Royal Society. 1950; 7 :138–54. [ Google Scholar ]
  • IHME (Institute for Health Metrics and Evaluation) The Challenge Ahead: Progress and Setbacks in Breast and Cervical Cancer. Seattle: 2011. [ Google Scholar ]
  • Illari PM, Russo F, Williamson J. Why Look at Causality in the Sciences? A Manifesto. In: Illari PM, Russo F, Williamson J, editors. Causality in the Sciences. Oxford: Oxford University Press; 2011. pp. 3–22. [ Google Scholar ]
  • Irwin A, Valentine N, Brown C, Loewenson R, Solar O, Brown H, Koller T, Vega J. The Commission on the Social Determinants of Health: Tackling the Social Roots of Health Inequities. PLoS Medicine. 2006; 3 (6):e106. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Issac J. The Human Sciences in Cold War America. Historical Journal. 2007; 50 :725–46. [ Google Scholar ]
  • Jansson S, Douglas CJ. Populus: A Model System for Plant Biology. Annual Review of Plant Biology. 2007; 58 :435–458. [ PubMed ] [ Google Scholar ]
  • Jaslow R. CBS News. 2011. Breast, Cervical Cancer Rates Rising around World: Why? September 15, 2011. Available at http://www.cbsnews.com/8301-504763_162-20106719-10391704.html (accessed June 17, 2012) [ Google Scholar ]
  • Keller EF. The Century of the Gene. Cambridge, MA: Harvard University Press; 2000. [ Google Scholar ]
  • Keller EF. The Mirage of a Space between Nature and Nurture. Durham, NC: Duke University Press; 2010. [ Google Scholar ]
  • Kermack WO, McKendrick AG. Contributions to the Mathematical Theory of Epidemics, Part I. Proceedings of the Royal Society Series A. 1927; 115 :700–721. [ Google Scholar ]
  • Kevels D. In the Name of Eugenics: Genetics and the Uses of Human Heredity. New York: Knopf; 1985. [ Google Scholar ]
  • Kincaid H. Causal Modeling, Mechanisms, and Probability in Epidemiology. In: Illari PM, Russo F, Williamson J, editors. Causality in the Sciences. Oxford: Oxford University Press; 2011. pp. 70–90. [ Google Scholar ]
  • Klassen AC, Smith KC. The Enduring and Evolving Relationship between Social Class and Breast Cancer Burden: A Review of the Literature. Cancer Epidemiology. 2011; 35 :217–34. [ PubMed ] [ Google Scholar ]
  • Kolata G. New York Times. 2006. Reversing Trend, Big Drop Is Seen in Breast Cancer. December 15. Available at http://www.nytimes.com/2006/12/15/health/15breast.html?pagewanted=all (accessed June 17, 2012) [ Google Scholar ]
  • Kolata G. New York Times. 2007. Sharp Drop in Rates of Breast Cancer Holds. April 19. Available at http://query.nytimes.com/gst/fullpage.html?res=9a03e6d91e3ff93aa25757c0a9619c8b63 (accessed June 17, 2012) [ Google Scholar ]
  • Krieger N. Epidemiology and the Web of Causation: Has Anyone Seen the Spider. Social Science & Medicine. 1994; 39 :887–903. [ PubMed ] [ Google Scholar ]
  • Krieger N. Epidemiology and Social Sciences: Towards a Critical Reengagement in the 21st Century. Epidemiology Review. 2000; 11 :155–63. [ PubMed ] [ Google Scholar ]
  • Krieger N. Theories for Social Epidemiology in the 21st Century: An Ecosocial Perspective. International Journal of Epidemiology. 2001; 30 :668–77. [ PubMed ] [ Google Scholar ]
  • Krieger N. Breast Cancer: A Disease of Affluence, Poverty, or Both?—The Case of African American Women. American Journal of Public Health. 2002; 92 :611–13. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Krieger N. Embodiment: A Conceptual Glossary for Epidemiology. Journal of Epidemiology & Community Health. 2005; 59 :350–55. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Krieger N. Ways of Asking and Ways of Living: Reflections on the 50th Anniversary of Morris’ Ever-Useful Uses of Epidemiology. International Journal of Epidemiology. 2007a; 36 :1173–80. [ PubMed ] [ Google Scholar ]
  • Krieger N. Why Epidemiologists Cannot Afford to Ignore Poverty. Epidemiology. 2007b; 18 :658–63. [ PubMed ] [ Google Scholar ]
  • Krieger N. Hormone Therapy and the Rise and Perhaps Fall of US Breast Cancer Incidence Rates: Critical Reflections. International Journal of Epidemiology. 2008; 37 :627–37. [ PubMed ] [ Google Scholar ]
  • Krieger N. Epidemiology and the People's Health: Theory and Context. New York: Oxford University Press; 2011. [ Google Scholar ]
  • Krieger N, Bassett M, Gomez S. Re: “Breast and Cervical Cancer in 187 Countries between 1980 and 2010.” The Lancet. 2012; 379 :1391–92. [ PubMed ] [ Google Scholar ]
  • Krieger N, Chen JT, Waterman PD. Decline in US Breast Cancer Rates after the Women's Health Initiative: Socioeconomic and Racial/Ethnic Differentials. American Journal of Public Health. 2010; 100 :S132–S139. erratum, 972. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Krieger N, Chen JT, Waterman PD. Temporal Trends in the Black/White Breast Cancer Case Ratio for Estrogen Receptor Status: Disparities Are Historically Contingent, Not Innate. Cancer Causes and Control. 2011; 22 :511–14. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Krieger N, Davey Smith G. Bodies Count & Body Counts: Social Epidemiology & Embodying Inequality. Epidemiology Review. 2004; 26 :92–103. [ PubMed ] [ Google Scholar ]
  • Krieger N, Fee E. Measuring Social Inequalities in Health in the United States: An Historical Review, 1900–1950. International Journal of Health Services. 1996; 26 :391–418. [ PubMed ] [ Google Scholar ]
  • Krieger N, Williams D, Moss N. Measuring Social Class in US Public Health Research: Concepts, Methodologies and Guidelines. Annual Review of Public Health. 1997; 18 :341–78. [ PubMed ] [ Google Scholar ]
  • Kuhlmann M. Mechanisms in Dynamically Complex Systems. In: Illari PM, Russo F, Williamson J, editors. Causality in the Sciences. Oxford: Oxford University Press; 2011. pp. 880–906. [ Google Scholar ]
  • Kukuall WA, Ganguli M. Generalizability: The Trees, the Forest, and the Low-Hanging Fruit. Neurology. 2012; 78 :1886–91. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kunitz SJ. The Health of Populations: General Theories and Particular Realities. New York: Oxford University Press; 2007. [ Google Scholar ]
  • Kuzawa C. Why Evolution Needs Development, and Medicine Needs Evolution. International Journal of Epidemiology. 2012; 41 :223–29. [ PubMed ] [ Google Scholar ]
  • Lane-Claypon JE. A Further Report on Cancer of the Breast with Special Reference to Its Associated Antecedent Conditions. Reports on Public Health and Medical Subjects no. 32. London: HMSO; 1926. [ Google Scholar ]
  • Lewontin R. The Triple Helix: Gene, Organism, and Environment. Cambridge, MA: Harvard University Press; 2000. [ Google Scholar ]
  • Lilienfeld AM, editor. Times, Places, and Persons: Aspects of the History of Epidemiology. Baltimore: Johns Hopkins University Press; 1980. [ Google Scholar ]
  • Limpert E, Stahel WA, Abbt M. Log-Normal Distributions across the Sciences: Keys and Clues. BioSci. 2001; 51 :341–52. [ Google Scholar ]
  • Mackenzie D. Statistics in Britain, 1865–1930: The Social Construction of Scientific Knowledge. Edinburgh: Edinburgh University Press; 1982. [ Google Scholar ]
  • Martin J, Harré R. Metaphor in Science. In: Miall DS, editor. Metaphor: Problems and Perspectives. Sussex, NJ: Harvester Press; 1982. pp. 89–105. [ Google Scholar ]
  • Marx K. In: Theses on Feuerbach. Dietz JHW, editor. Stuttgart: 1845. 1888. First published, in an edited version, as an appendix to Engels F. Ludwig Feuerbach und der Ausgang der klassischen deutschen Philosophie. Mit Anghard: Karl Marx über Feuerbach von Jarhe 1845 . Available at http://www.marxists.org/archive/marx/works/1845/theses/index.htm (2002 trans. by Cyril Smith) (accessed June 17, 2012) [ Google Scholar ]
  • Mayr E. Towards a New Philosophy of Biology: Observations of an Evolutionist. Cambridge, MA: Harvard University Press; 1988. [ Google Scholar ]
  • Mendelsohn JA. From Eradication to Equilibrium: How Epidemics Became Complex after World War I. In: Lawrence C, Weisz G, editors. Greater Than the Parts: Holism in Biomedicine, 1920–1950. New York: Oxford University Press; 1998. pp. 303–31. [ Google Scholar ]
  • Mitchell M. Complexity: A Guided Tour. Oxford: Oxford University Press; 2009. [ Google Scholar ]
  • Morange M. The Misunderstood Gene. Cambridge, MA: Harvard University Press; 2001. [ Google Scholar ]
  • Morris JN. Uses of Epidemiology. Edinburgh: E. & S. Livingston; 1957. [ Google Scholar ]
  • Mountain JL. Human Evolutionary Genetics. In: Smelser NJ, Baltes PB, editors. International Encyclopedia of the Social & Behavioral Sciences. Oxford: Pergamon, Oxford; 2001. pp. 6984–91. DOI: 10.1016/B0-08-043076-7/03088-6 . [ Google Scholar ]
  • Nash K, Scott A, editors. The Blackwell Companion to Political Sociology. Malden, MA: Blackwell; 2001. [ Google Scholar ]
  • OED (Oxford English Dictionary) online. 2010. Draft revision June. Available at http://dictionary.oed.com.ezp-prod1.hul.harvard.edu/ (accessed June 17, 2012)
  • Parkin DM, Fernández LMG. Use of Statistics to Assess the Global Burden of Breast Cancer. Breast Journal. 2006; 12 (suppl. 1):S70– S80. [ PubMed ] [ Google Scholar ]
  • Pearce N. Epidemiology as a Population Science. International Journal of Epidemiology. 1999; 28 :S1015–S18. [ PubMed ] [ Google Scholar ]
  • Pflughoeft KJ, Versalovic J. Human Microbiome in Health and Disease. Annual Review of Pathology: Mechanisms of Disease. 2012; 7 :99–122. [ PubMed ] [ Google Scholar ]
  • Piermsa T, van Gils JA. The Flexible Phenotype: A Body-Centered Integration of Ecology, Physiology, and Behavior. New York: Oxford University Press; 2011. [ Google Scholar ]
  • Pizzi C, De Stavola B, Merletti F, Bellocco R, dos Santos Silva I, Pearce N, Richiardi L. Sample Selection and Validity of Exposure-Disease Association Estimates in Cohort Studies. Journal of Epidemiology & Community Health. 2011; 65 :407–11. [ PubMed ] [ Google Scholar ]
  • Porta M, editor. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press; 2008. [ Google Scholar ]
  • Porter TM. A Statistical Survey of Gases: Maxwell's Social Physics. Historical Studies in the Physical Sciences. 1981; 12 :77–116. [ Google Scholar ]
  • Porter TM. The Rise of Statistical Thinking, 1820–1900. Princeton, NJ: Princeton University Press; 1986. [ Google Scholar ]
  • Porter TM. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton, NJ: Princeton University Press; 1995. [ Google Scholar ]
  • Porter TM. Statistics and Physical Theories. In: Nye MJ, editor. The Modern Physical and Mathematical Sciences. Cambridge: Cambridge University Press; 2002. Cambridge Histories Online. DOI: 10.1017/CHOL9780521571999.027 . [ Google Scholar ]
  • Porter TM. Statistics and Statistical Methods. In: Porter TM, Ross D, editors. The Modern Social Sciences. Cambridge: Cambridge University Press; 2003. Cambridge Histories Online. DOI: 10.1017/CHOL9780521594424.015 . [ Google Scholar ]
  • Press DJ, Pharoah P. Risk Factors for Breast Cancer: A Reanalysis of Two Case-Control Studies from 1926 and 1931. Epidemiology. 2010; 21 :566–72. [ PubMed ] [ Google Scholar ]
  • Quetelet A. In: Sur l’homme et le development des ses facultés, ou essai de physique sociale. Knox R, translator. 1835. Paris. For a translation, see Quetelet, A. (1842) 1968. A Treatise on Man and the Development of His Faculties . Reprint, New York: Burt Franklin. [ Google Scholar ]
  • Quetelet A. Recherches statistiques. Brussels: M. Hayez (Imprimeur de la Commission centrale de statistique); 1844. [ Google Scholar ]
  • Ramsden E. Carving Up Population Science: Eugenics, Demography and the Controversy over the “Biological Law” of Population Growth. Social Studies of Science. 2002; 32 :857–99. [ Google Scholar ]
  • Ravdin PM, Cronin KA, Howlader N, Berg CD, Chlebowski RT, Feuer EJ, Edwards BK, Berry DA. The Decrease in Breast-Cancer Incidence in 2003 in the United States. New England Journal of Medicine. 2007; 356 :1670–74. [ PubMed ] [ Google Scholar ]
  • Ravdin PM, Cronin KA, Howlader N, Chlebowski RT, Berry DA. A Sharp Decrease in Breast Cancer Incidence in the United States in 2003. Breast Cancer Research and Treatment. 2006; 100 (suppl) S2 (abstract) [ Google Scholar ]
  • Relton CL, Davey Smith G. Is Epidemiology Ready for Epigenetics? International Journal of Epidemiology. 2012; 41 :5–9. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Richards RA. Encyclopedia of Life Sciences. New York: Wiley; 2001. Species Problem—A Philosophical Analysis. (online 2007). DOI: 10.1002/9780470015902.a0003456 . [ Google Scholar ]
  • Rose D, Pevalin DJ, editors. A Researcher's Guide to the National Statistics Socio-economic Classification. London: Sage; 2003. [ PubMed ] [ Google Scholar ]
  • Rose GA. Sick Individuals and Sick Populations. International Journal of Epidemiology. 1985; 14 :32–38. [ PubMed ] [ Google Scholar ]
  • Rose GA. The Strategy of Preventive Medicine. Oxford: Oxford University Press; 1992. [ Google Scholar ]
  • Rose GA. Rose's Strategy of Preventive Medicine: The Complete Original Text, with a Commentary by Kay-Tee Khaw and Michael Marmot. Oxford: Oxford University Press; 2008. [ Google Scholar ]
  • Rosen G. A History of Public Health. Baltimore: Johns Hopkins University Press; 1993. (1958) Expanded ed. Introduction by E. Fee; biographical essay and new bibliography by E.T. Morman. [ Google Scholar ]
  • Ross D. Changing Contours of the Social Science Disciplines. In: Porter TM, Ross D, editors. The Modern Social Sciences. Cambridge: Cambridge University Press; 2003. pp. 275–305. [ Google Scholar ]
  • Rossouw JE, Anderson GL, Prentice RL, LaCroix AZ, Kooperberg C, Stefanick ML, Jackson RD, Beresford SA, Howard BV, Johnson KC, Kotchen JM, Ockene J, Writing Group for the Women's Health Initiative Investigators Risk and Benefits of Estrogen plus Progestin in Healthy Postmenopausal Women: Principal Results from the Women's Health Initiative Randomized Controlled Trial. JAMA. 2002; 288 :321–33. [ PubMed ] [ Google Scholar ]
  • Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008. [ Google Scholar ]
  • Sarkar S. Lancelot Hogben, 1895–1975. Genetics. 1996; 142 :655–60. [ Google Scholar ]
  • Schank JC, Twardy C. Mathematical Models. In: Bowler PJ, Pickstone JV, editors. The Modern Biological and Earth Sciences. Cambridge: Cambridge University Press; 2009. Cambridge Histories Online. DOI: 10.1017/CHOL9780521572019.023 . [ Google Scholar ]
  • Schlesselman JJ, Stadel BV. Exposure Opportunity in Epidemiologic Studies. American Journal of Epidemiology. 1987; 125 :174–78. [ PubMed ] [ Google Scholar ]
  • Scott J, Marshall G, editors. A Dictionary of Sociology. 3rd ed. Oxford: Oxford University Press; 2005. [ Google Scholar ]
  • Shaghaghi A, Bhopal RJ, Sheik A. Approaches to Recruiting “Hard-to-Reach” Populations in Research: Review of the Literature. Health Promotion Perspectives. 2011; 1 (2):1–9. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Smith GD. The Uses of “Uses of Epidemiology.” International Journal of Epidemiology. 2001; 30 :1146–55. [ PubMed ] [ Google Scholar ]
  • Smylie J, Lofters A, Firestone M, O’Campo P. Population-Based Data and Community Empowerment. In: O’Campo P, Dunn JR, editors. Rethinking Social Epidemiology: Towards a Science of Change. Dordrecht: Springer Science+Business Media B.V; 2012. pp. 68–92. [ Google Scholar ]
  • Stanley D, Phelps AE, Banaji MR. The Neural Basis of Implicit Attitudes. Current Directions in Psychological Science. 2008; 17 :165–70. [ Google Scholar ]
  • Steinman E. Sovereigns and Citizens? The Contested Status of American Indian Tribal Nations and Their Members. Citizenship Studies. 2011; 15 :57–74. [ Google Scholar ]
  • Stigler SM. The History of Statistics: The Measurement of Uncertainty before 1900. Cambridge, MA: Belknap Press /Harvard University Press; 1986. [ Google Scholar ]
  • Stigler SM. Regression towards the Mean, Historically Considered. Statistical Methods in Medical Research. 1997; 6 :103–14. [ PubMed ] [ Google Scholar ]
  • Stigler SM. The Average Man Is 168 Years Old. In: Stigler SM, editor. Statistics on the Table: The History of Statistical Concepts and Methods. Cambridge, MA: Harvard University Press; 2002. pp. 51–65. [ Google Scholar ]
  • Stiglitz J. Freefall: America, Free Markets, and the Sinking World Economy. New York: Norton; 2010. [ Google Scholar ]
  • Strevens M. Bigger Than Chaos: Understanding Complexity through Probability. Cambridge, MA: Harvard University Press; 2003. [ Google Scholar ]
  • Susser M, Stein Z. Eras in Epidemiology: The Evolution of Ideas. New York: Oxford University Press; 2009. [ Google Scholar ]
  • Svensson P-G. Special Issue: Health Inequities in Europe. Social Science & Medicine. 1990; 31 :225–27. [ PubMed ] [ Google Scholar ]
  • Sydenstricker E. Health and Environment. New York: McGraw-Hill; 1933. [ Google Scholar ]
  • Tabery J. R.A. Fisher, Lancelot Hogben, and the Origin(s) of Genotype-Environment Interaction. Journal of the History of Biology. 2008; 41 :717–61. [ PubMed ] [ Google Scholar ]
  • Turner JH. A New Approach for Theoretically Integrating Micro and Macro Analyses. In: Calhoun C, Rojek C, Turner B, editors. The Sage Handbook of Sociology. Thousand Oaks, CA: Sage; 2005. pp. 405–22. [ Google Scholar ]
  • U.S. Citizenship and Immigration Services. 2012. Citizenship. Available at http://www.uscis.gov/portal/site/uscis/ (accessed June 17, 2012)
  • Vona-Davis L, Rose DP. The Influence of Socioeconomic Disparities on Breast Cancer Tumor Biology and Prognosis: A Review. Journal of Women's Health. 2009; 18 :883–93. [ PubMed ] [ Google Scholar ]
  • Wainwright JM. A Comparison of Conditions Associated with Breast Cancer in Great Britain and America. American Journal of Cancer. 1931; 15 :2610–45. [ Google Scholar ]
  • Wallace TA, Martin DN, Ambs S. Interactions among Genes, Tumor Biology and the Environment in Cancer Health Disparities: Examining the Evidence on a National and Global Scale. Carcinogenesis. 2011; 32 :1107–21. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Walter J, Ley R. The Human Gut Microbiome: Ecology and Recent Evolutionary Changes. Annual Review of Microbiology. 2011; 65 :411–29. [ PubMed ] [ Google Scholar ]
  • Weiss KM, Long JC. Non-Darwinian Estimation: My Ancestors, My Genes’ Ancestors. Genome Research. 2009; 19 :703–10. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Werskey G. In: The Visible College: A Collective Biography of British Scientists and Socialists of the 1930s. Young RM, editor. London: Free Association Books; 1988. Foreword by. [ Google Scholar ]
  • West-Eberhard MT. Developmental Plasticity and Evolution. New York: Oxford University Press; 2003. [ Google Scholar ]
  • Whitehead M. The Concepts and Principles of Equity and Health. International Journal of Health Services. 1992; 22 :429–45. [ PubMed ] [ Google Scholar ]
  • WHO (World Health Organization) Closing the Gap in a Generation: Health Equity through Action on the Social Determinants of Health. 2008. Commission on the Social Determinants of Health—Final Report. Geneva. Available at http://www.who.int/social_determinants/thecommission/finalreport/en/index.html (accessed June 17, 2012) [ PubMed ] [ Google Scholar ]
  • WHO (World Health Organization) 2011. Rio Political Declaration on Social Determinants of Health. Rio de Janeiro, October 21. Available at http://www.who.int/sdhconference/declaration/en/index.html (accessed June 17, 2012)
  • Wiehl DG. Edgar Sydenstricker: A Memoir. In: Kasius RV, editor. The Challenge of the Facts: Selected Public Health Papers of Edgar Sydenstricker. New York: Prodist, for the Milbank Memorial Fund; 1974. pp. 1–17. [ Google Scholar ]
  • Williams R. Keywords: A Vocabulary of Culture and Society. Rev. ed. New York: Oxford University Press; 1985. [ Google Scholar ]
  • Wimmer A, Schiller NG. Methodological Nationalism and Beyond: Nation-State, Migration, and the Social Sciences. Global Networks. 2002; 4 :301–34. [ Google Scholar ]
  • Winkelstein W., Jr . Oxford Dictionary of National Biography. Oxford: Oxford University Press; 2004. Claypon, Janet Elizabeth Lane- [married name Janet Elizabeth Forber, Lady Forber] (1877–1967) Available at http://www.oxforddnb.com.ezp-prod1.hul.harvard.edu/view/article/61714 (accessed June 17, 2012) [ Google Scholar ]
  • Winslow C-EA, Smillie WG, Doull JA, Gordon JE. In: The History of American Epidemiology. Top FH, editor. Mosby; 1952. Sponsored by the Epidemiology Section, American Public Health Association. St. Louis. [ Google Scholar ]
  • Wright EO, editor. Approaches to Class Analysis. Cambridge: Cambridge University Press; 2005. [ Google Scholar ]
  • Wright S. The Relative Importance of Heredity and Environment in Determining the Pie-Bald Pattern of Guinea-Pigs. 1920; 6 :320–32. Proceedings of the National Academy of Sciences . [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yeo EJ. Social Surveys in the Eighteenth and Nineteenth Centuries. In: Porter TM, Ross D, editors. The Modern Social Sciences. Cambridge: Cambridge University Press; 2003. Cambridge Histories Online. DOI: 10.1017/CHOL9780521594424.007 . [ Google Scholar ]
  • Young TK. Population Health: Concepts and Methods. 2nd ed. New York: Oxford University Press; 2005. [ Google Scholar ]
  • Zbuk K, Anand SS. Declining Incidence of Breast Cancer after Decreased Use of Hormone-Replacement Therapy: Magnitude and Time Lags in Different Countries. Journal of Epidemiology & Community Health. 2012; 66 :1–7. [ PubMed ] [ Google Scholar ]
  • Ziman J. Real Science: What It Is and What It Means. Cambridge: Cambridge University Press; 2000. [ Google Scholar ]
  • Zinn H. A People's History of the United States: 1492–Present. New York: HarperCollins; 2003. [ Google Scholar ]

An equitable and sustainable future for everyone, everywhere

We transform global thinking on critical health and development issues through social science, public health, and biomedical research.

Featured insights

The girl agenda cannot wait: collaboration and multi-dimensional investments are needed for adolescent girls' empowerment.

Reflections from events at the 68th Commission on the Status of Women (CSW68) in New York.

A Turning Point in Mexican Law: Insights into the Supreme Court Orders to Decriminalize Abortion at the Federal Level

Although the Supreme Court’s decision to decriminalize abortion is a major step toward comprehensive sexual and reproductive health care, legal and non-legal barriers still exist that restrict abortion access for women and people with the capacity for pregnancy in Mexico.

Supporting Women in their Public Health Careers

A Q&A with Population Council's WomenLift Health Leadership Journey Participants.

Latest News

  • The Coming Birth-Control Revolution
  • A fond farewell to Dr. Harriet Birungi
  • The Abortion Pill on Trial

Focus Areas

Sexual and reproductive health, rights, and choices.

Adolescents and Young People

Gender Equality and Equity

Climate and Environmental Changes

Innovation Hubs

Center for Biomedical Research (CBR)

Developing and ensuring access to innovative and affordable products that promote sexual and reproductive health, rights, and choices

Girl Innovation, Research, and Learning (GIRL) Center

Envisioning a gender-equitable world where girls and boys make a healthy and safe transition into adulthood and reach their full potential

Humanitarian Task Force (HTF)

Conducting cutting-edge research to produce effective solutions for people affected by complex emergencies, natural disasters, and post-conflict crises

Population, Environmental Risks, and the Climate Crisis (PERCC) Initiative

Generating ideas and developing sustainable and equitable solutions to pursue justice in the face of climate and environmental change

Gender, Education, Justice, and Equity (GEJE)

Shaping a just world: Building foundational skills, equity, and critical thinking for all children

The Population Council is a leading research organization dedicated to building an equitable and sustainable world that enhances the health and well-being of current and future generations. We generate ideas, produce evidence, and design solutions to improve the lives of underserved populations around the world.

2023 The Population Council, Inc. All rights reserved.

medRxiv

Corporate activities that influence population health: A scoping review and qualitative synthesis to develop the HEALTH-CORP typology

  • ORCID record for Raquel Burgess

Introduction: The concept of the commercial determinants of health (CDH) is used to study the actions (and associated structures) of commercial entities that influence population health and health equity. The aim of this study was to develop a typology that describes the diverse set of activities through which corporations influence population health and health equity across industries.

Methods: We conducted a scoping review of articles using CDH terms (n=116) that discuss corporate activities that can influence population health and health equity across 16 industries. We used the qualitative constant comparison method to build a typology called the Corporate Influences on Population Health (HEALTH-CORP) typology.

Results: The HEALTH-CORP typology identifies 70 corporate activities that can influence health across industries and categorizes them into seven domains of corporate influence (e.g., political practices, employment practices). We present a model that situates these domains based on their proximity to health outcomes and identify five population groups (e.g., workers, local communities) to consider when evaluating corporate health impacts.

Discussion: The HEALTH-CORP typology facilitates an understanding of the diverse set of corporate activities that can influence population health and the population groups affected by these activities. We discuss the utility of these contributions in terms of identifying interventions to address the CDH and advancing efforts to measure and monitor the CDH. We also leverage our findings to identify key gaps in CDH literature and suggest avenues for future research.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

Raquel Burgess was supported by a Doctoral Foreign Study Award provided by the Canadian Institutes of Health Research at the time this research was conducted. Funding was provided by the Yale School of Public Health and the Yale Graduate Student Assembly to present this work at the American Public Health Association Annual Meeting in 2022.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

The data for this study are published academic articles which are available from the respective publishers (see Supplementary Material, Appendix 2 for the characteristics of included articles). In addition, we uploaded the following files to Open Science Framework (DOI 10.17605/OSF.IO/TG9S7) to support data availability: 1) a .csv file containing a list of the articles that underwent title and abstract screening in our study and the respective screening decisions that were assigned, and 2) .ris files containing the citations to the respective articles and the assigned screening decisions, which can be uploaded into a reference manager. Interested parties can contact the corresponding author for additional information.

https://osf.io/tg9s7/


COMMENTS

  1. What Is the Big Deal About Populations in Research?

    In research, there are 2 kinds of populations: the target population and the accessible population. The accessible population is exactly what it sounds like, the subset of the target population that we can easily get our hands on to conduct our research. While our target population may be Caucasian females with a GFR of 20 or less who are ...

  2. Population vs. Sample

    The research population, also known as the target population, refers to the entire group or set of individuals, objects, or events that possess specific characteristics and are of interest to the researcher. It represents the larger population from which a sample is drawn. The research population is defined based on the research objectives and ...

  3. Research Fundamentals: Study Design, Population, and Sample Size

    design, population of interest, study setting, recruitment, and sampling. Study Design. The study design is the use of evidence-based procedures, protocols, and guidelines that provide the ...

  4. Statistics without tears: Populations and samples

    Research workers in the early 19th century endeavored to survey entire populations. This feat was tedious, and the research work suffered accordingly. Current researchers work only with a small portion of the whole population (a sample) from which they draw inferences about the population from which the sample was drawn.

  5. Population Studies at 75 years: An empirical review

    Introduction. For 75 years, the journal Population Studies has published work advancing our knowledge of demography and population, from substantive topics in the areas of fertility, mortality, migration, and families to innovations in theory, methods, policy, and practice. Demographic topics, theories, and methods have drawn from multiple disciplines, ranging from economics to sociology ...

  6. Working with population data

    A population is the set of all cases that are eligible and relevant, and that had a genuine chance of taking part in the research. If all of these cases are involved, or are invited to be involved, in the research then the study is of a population (rather than of a 'sample'). It is a kind of census.

  7. 7 Samples and Populations

    So if you want to sample one-tenth of the population, you'd select every tenth name. In order to know the k for your study you need to know your sample size (say 1000) and the size of the population (75000). You can divide the size of the population by the sample (75000/1000), which will produce your k (75).

  8. Understanding Population in Scientific Research: A Comprehensive

    Explore the concept of population in scientific research and learn how to define and generalize findings to larger groups. Gain insights into sampling, generalizability, and the importance of population in study design ...

  9. Population vs. Sample

    A population is the entire group that you want to draw conclusions about. A sample is the specific group that you will collect data from. The size of the sample is always less than the total size of the population. In research, a population doesn't always refer to people. It can mean a group containing elements of anything you want to study, such as objects, events, organizations, countries ...

  10. Study Population

    Study population is a subset of the target population from which the sample is actually selected. It is broader than the concept of a sample frame. It may be appropriate to say that the sample frame is an operationalized form of the study population. For example, suppose that a study is going to conduct a survey of high school students on their social well-being. ...

  11. Full article: Looking to the future of Population Studies

    From its inception in 1946, Population Studies has taken a broad view of demography, reflecting the outlook of its founding editor, David Glass, and carried forward during its first 50 years by Eugene Grebenik. The aim of its 50th anniversary issue in 1996 was to describe developments in demographic research during its first 50 years of existence.

  12. Defining and Identifying Members of a Research Study Population: CTSA

    The defined population then will become the basis for applying the research results to other relevant populations. Clearly defining a study population early in the research process also helps assure the overall validity of the study results. Many research reports fail to define or describe a study population adequately.

  13. Research Guides: Human Geography: Population studies

    Such work could often be distinguished from population studies in general by its use of smaller scale data, below national level. ... The Population Research Institute is a non-profit research organization whose core values hold that people are the world's greatest resource. PRI's goals are to educate on this premise, to expose the myth of ...

  14. Major Trends in Population Growth Around the World

    The world's population continues to grow, reaching 7.8 billion by mid-2020, rising from 7 billion in 2010, 6 billion in 1998, and 5 billion in 1986. The average annual growth rate was around 1.1% in 2015-2020, which steadily decreased after it peaked at 2.3% in the late 1960s.

  15. Population Growth

    Population growth is one of the most important topics we cover at Our World in Data. For most of human history, the global population was a tiny fraction of what it is today. Over the last few centuries, the human population has gone through an extraordinary change. In 1800, there were one billion people. Today there are more than 8 billion of us.

  16. PDF Understanding Research Methods, Populations and Sampling

    • The model of the research process, (see the following slide), shows that, at this stage in the research process we have decided on the research methodology to be used in the research project, and we have now come to the stage of defining the population of the research, deciding whether to work with the entire population or with a sample

  17. Population Research

    3.09.2.5 Minnesota Population Center (MPC) The Minnesota Population Center is an interdisciplinary cooperative for demographic research. It focuses especially on population data science and on census and survey methodology. In addition, MPC conducts research on population mobility, reproductive and sexual health, and work, family, and time.

  18. Research Population

    Research Population. All research questions address issues that are of great relevance to important groups of individuals known as a research population. A research population is generally a large collection of individuals or objects that is the main focus of a scientific query. It is for the benefit of the population that research is done.

  19. PDF Describing Populations and Samples in Doctoral Student Research

    The sampling frame intersects the target population. The sample and sampling frame described extends outside of the target population and population of interest as occasionally the sampling frame may include individuals not qualified for the study. Figure 1. The relationship between populations within research.

  20. Samples & Populations in Research

    Tell your students that you will read a scenario and they must decide on whether the research scenario relates to a population or a sample. If it is a sample, they must identify the type of sample ...

  21. Study Population: Characteristics & Sampling Techniques

    A study population is a group considered for a study or statistical reasoning. The study population is not limited to the human population only. It is a set of aspects that have something in common. They can be objects, animals, measurements, etc., with many characteristics within a group. For example, suppose you are interested in the average ...

  22. Sampling Methods

    The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population). Example: Sampling frame You are doing research on working conditions at a social media marketing company. Your population is all 1000 employees of the ...

  23. Who and What Is a Population?

    Methods. In this article, I review the current conventional definitions of, and historical debates over, the meaning(s) of "population," trace back the contemporary emphasis on populations as statistical rather than substantive entities to Adolphe Quetelet's powerful astronomical metaphor, conceived in the 1830s, of l'homme moyen (the average man), and argue for an alternative definition ...

  24. (PDF) CONCEPT OF POPULATION AND SAMPLE

    Abstract. This paper deals with the concept of Population and Sample in research, especially in educational and psychological researches and the researches carried out in the field of Sociology ...

  25. Population Council

    The Population Council is a leading research organization dedicated to building an equitable and sustainable world that enhances the health and well-being of current and future generations. We generate ideas, produce evidence, and design solutions to improve the lives of underserved populations around the world.

  26. Corporate activities that influence population health: A scoping review

    Introduction: The concept of the commercial determinants of health (CDH) is used to study the actions (and associated structures) of commercial entities that influence population health and health equity. The aim of this study was to develop a typology that describes the diverse set of activities through which corporations influence population health and health equity across industries ...
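
Item 7 in the list above describes systematic sampling, where the interval k is the population size divided by the sample size (75,000 / 1,000 gives k = 75). The following is a minimal Python sketch of that arithmetic and of drawing every k-th entry from a list. The function names `sampling_interval` and `systematic_sample`, the random starting point, and the integer frame used as a stand-in for a list of names are illustrative assumptions, not part of any cited source.

```python
import random


def sampling_interval(population_size: int, sample_size: int) -> int:
    """Return k, the gap between selected entries (population // sample)."""
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    return population_size // sample_size


def systematic_sample(frame, sample_size, seed=None):
    """Pick every k-th entry of `frame`, starting at a random offset below k.

    The random start is an assumption of this sketch; it keeps the first
    entry of the list from always being selected.
    """
    k = sampling_interval(len(frame), sample_size)
    rng = random.Random(seed)
    start = rng.randrange(k)  # offset in 0 .. k-1
    return [frame[start + i * k] for i in range(sample_size)]


# Hypothetical frame: 75,000 numbered entries standing in for a list of names.
frame = list(range(75000))
k = sampling_interval(len(frame), 1000)
sample = systematic_sample(frame, 1000, seed=0)
print(k, len(sample))
```

Integer division is used so that the last selected index never runs past the end of the list when the population size is not an exact multiple of the sample size.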